Abstract. We introduce a class of stochastic models for the dynamics of two
linguistic variants that are competing to become the single, shared convention within an
unstructured community of speakers. Different instances of the model are distinguished
by the way agents handle variability in the language (i.e., multiple forms for the same
meaning). The class of models includes as special cases two previously-studied models
of language dynamics, the Naming Game, in which agents tend to standardise on
variants they have encountered most frequently, and the Utterance Selection Model, in
which agents tend to preserve variability by uniform sampling of a pool of utterances.
We reduce the full complexities of the dynamics to a single-coordinate stochastic model
which allows the probability and time taken for speakers to reach consensus on a single
variant to be calculated for large communities. This analysis suggests that in the broad
class of models considered, consensus is formed in one of three generic ways, according
to whether agents tend to eliminate, accentuate or sample neutrally the variability in
the language. These different regimes are observed in simulations of the full dynamics,
and for which the simplified model in some cases makes good quantitative predictions.
We use these results, along with comparisons with related models, to conjecture the
likely behaviour of more general models, and further make use of empirical data to
argue that in reality, biases away from neutral sampling behaviour are likely to be
small.

1. Introduction

Statistical mechanical modelling is increasingly being used as a methodology to 
reproduce and predict large-scale, emergent regularities in human systems. Recent 
examples where good quantitative agreement with empirical data has been observed 
(sometimes reported under the guise of 'agent-based' modelling, which is essentially 
the same approach) include the flow properties of highway traffic [1], some aspects 
of stampedes arising from crowd panic [2] and various distributions relating to firms
in an economy [3]. The similarity with traditional condensed matter applications is 
that macroscopic properties are obtained by averaging over an ensemble of microscopic 
degrees of freedom. The key difference, however, is that the nature of the underlying 
microscopic interactions is itself a source of uncertainty. Here one hopes to be saved 
by some kind of 'universality principle', which we will loosely interpret as meaning 
that a wide class of systems differing in microscopic details nevertheless display similar 
generic properties at large scales. In this work, we scrutinise the relevance of this idea 
in the context of social dynamics. This is a relatively new, but nevertheless burgeoning 
application domain in statistical mechanics (see [4] for a comprehensive review) that
promises to contribute to the general quantitative understanding of cultural origins,
evolution and change [5, 6].

Specifically, we focus on the situation where speakers of a language have a choice 
of two different ways of saying the same thing. Which of these two variants is uttered 
by a speaker at a given time is assumed to be a function of that speaker's exposure to 
utterances produced by herself‡ and other members of her community at earlier times.
(This is in the spirit of the usage-based approach to linguistics [7].) Over time, the
relative frequencies with which specific variants are used may fluctuate, and may perhaps 
reach a steady state in which both variants are used with some non-zero frequency, or 
one variant may go extinct. A lot of theoretical attention has been devoted to the 
latter case, and has been described as conventionalisation in a linguistic context [8],
consensus in opinion dynamics [4] (in which agents hold one of a number of variant
opinions whose frequencies change over time) and fixation in population genetics [9]. 
Meanwhile, this process has also been of empirical interest, for example, in sociolinguistic 
studies charting the rise of one set of phonetic realisations of vowel sounds over another 
in such locations as Philadelphia [10] and New Zealand [11], or in the adoption of an
innovative technology such as hybrid corn in Iowa [12].

Even within this fairly restricted context of consensus formation (the terminology 
that we will adopt here), a wide variety of models have been proposed [4]. How this 
space of models is structured is at present unclear. In this work we seek to gain 
some understanding of what this structure might be by introducing and analysing 
a class of models that interpolates between two distinct types of individual agent 
(speaker) behaviour and includes as specific instances contrasting models that have been
previously discussed with reference to language dynamics. One feature that all these
models have in common is neutrality: one phonetic realisation of the vowel appearing
in the word 'trap', for example, is not assumed to be better suited to the task than
any other, and hence a priori preferred by all speakers. Where the models differ
is in the rule used to decide which variant to utter in an interaction. Two generic
strategies, namely sampling and maximising, are encapsulated. A sampler produces a
variant with a probability equal to the frequency that she perceives it to be used in
her community. A maximiser, on the other hand, chooses the variant she believes to
be used most frequently in the community. In model systems, sampling behaviour is
represented by the Voter Model [13], the Utterance Selection Model of language change
[14] and relatives. Meanwhile, maximising can be identified in nonlinear Voter Models
[15, 16, 17, 18] and the Naming Game [19]. If interpreted as a tendency to preserve
or eliminate grammatical irregularity in an artificial language, these two behaviours
have also been identified in children and adults respectively during psycholinguistics
experiments [20] and further shown to affect the emergent structure of a language
acquired by successive generations of speakers [21].

‡ In common with earlier works, we will use male and female gender pronouns when referring to
listening and speaking agents, respectively.

In this work, we establish the generic modes of consensus formation exhibited by 
populations of interacting speakers that are differentiated through the behaviour of the 
individual agents that they are composed of. Our strategy is to start with two concrete 
models, namely versions of the Voter / Utterance Selection Model and Naming Game, 
in which agents invoke local sampling and maximising rules respectively. In Section [2] 
we recall the definitions of these models so as to establish that, when restricted to two 
variants, their microscopic update rules differ in two fundamental ways. One involves 
a bias towards categorical use of a single variant that is imposed by the listener in 
any given interaction; the other a similar bias applied by the speaker. By varying
the strength of these biases, a two-parameter hybrid model is generated within which 
we find only three distinct modes of consensus formation. The modelling approach is 
similar to that followed in [22], in which a single-parameter generalisation of the Naming
Game is constructed by employing one of the microscopic dynamical rules stochastically. 
Despite the fact that the dynamics of this latter generalisation cannot be reproduced 
by special choices of the parameters in the present hybrid model, we find that the same 
generic phase behaviour is common to both families of models, thereby adding weight 
to the hypothesis that the collective behaviour is only weakly affected by changes in the 
microscopic dynamics (except near a transition point).

The model's phase diagram we establish initially by examining deterministic
equations of motion, presented in Section 3. We find two distinct regimes, one in which
consensus on the majority variant is reached, and another in which both variants coexist
in perpetuity, separated by a line in the parameter space in which the variant frequencies
do not change over time. The addition of noise allows consensus to be reached for all
parameter combinations, and it is these stochastic effects that are of greatest interest in
the context of consensus formation. In particular, we seek to calculate the probability
that consensus on a particular variant will be reached, and the time taken to do so, given the
initial condition. Since the full stochastic equations of motion are rather complicated,
we use observations on the nature of the deterministic trajectories to suggest a means
to reduce to a single stochastic dynamical variable. These simplified dynamics are
formulated in Section 5 and analysed in Section 6 for large communities, from which we
find evidence that the phase diagram is robust to noise. Furthermore, we find that the
typical consensus time in a maximising community is of order $N \ln N$ interactions (where
$N$ is the community size), that in a pure sampling community is of order $N^2$ interactions, and
that it is exponentially large in $N$ when speakers exhibit an 'anti-maximising' behaviour, i.e., a
tendency to prolong variability in the language. The validity of the simplified dynamics
as a proxy for the full dynamics is explored through computer simulation in Section 7.
We find that its predictions for the probability that a particular variant wins out agree
rather well with simulation, and those for the mean time to do so are qualitatively correct
but in some regimes display some quantitative differences from the values observed in
simulation.

Whilst these consensus statistics are by now well established for the Voter Model 
and its relatives [23, 24, 14, 25], only the deterministic behaviour of the Naming Game
in the two-variant regime has been treated analytically in previous works [19, 22, 26].
Our findings for the dynamics of the relaxation to the state of consensus in the presence 
of noise therefore complement existing studies of the static critical phenomena seen in a
family of models that include the Voter and Ising Models as special cases [IHl [161 E] • In 
the concluding section, we return to the question of universality and make conjectures 
for the likely consensus properties for more general models of language dynamics, based 
on the insight gained from the hybrid model within the simplified single-coordinate 
approach. Finally, we confront the hybrid model with empirical data for new-dialect 
formation [11, 27] to demonstrate the possibility that, in a real human system, a
behavioural bias away from pure sampling behaviour may in fact be quite small. 

2. The hybrid model 

The family of models under consideration has a single community of $N$ agents evolving
by a sequence of interactions between pairs of randomly-chosen individuals. One of
each pair is designated with probability $\frac{1}{2}$ the speaker and the other the listener. Since
all pairs of speakers interact equally often, this defines an unstructured or mean-field 
community. Although clearly real communities exhibit structure, this simplification 
provides a substrate upon which we can precisely identify fundamental similarities and 
differences between models, the main motivation behind this work. 

The model language comprises two variant forms for a single meaning. For example, 
two different phonetic realisations of a vowel sound, or two different words for an object. 
We denote these variants A and B, and call their uttered realisations tokens of the 
respective variants. We refer to a speaker's internal representation of the frequency that 
A and B variants are used in the community as her store. In general, samplers and
maximisers use the information in the store in different ways to decide which variant
to use, and respond differently to an uttered token. In the concrete implementations
that have previously been proposed as models for language behaviour, the Utterance
Selection Model and the Naming Game, we find that tokens are produced in the same
way in both models, and the distinction between sampling and maximising is manifested
after the production event. It is this distinction that will form the basis of the hybrid
model, which we define after recalling the two distinct limiting cases.

Figure 1. Illustration of the update rules for the sampling-based Voter and Utterance
Selection Models. Each set of arrows corresponds to events that occur with equal
likelihood; the production and replacement events realised in the illustrated interaction
are indicated by solid lines.

Sampling behaviour: the Voter Model and its relatives. A population of sampling agents
can be implemented as follows. Each agent retains a fixed-sized store of previously heard 
tokens. When an agent is required to utter a token, she simply chooses one at random 
from the store. The listener meanwhile replaces one of his stored tokens, chosen at 
random, with a copy of the token produced by the speaker. As a consequence, each 
speaker maintains a sample of previously encountered tokens whose relative proportions 
of As and Bs will reflect that of the wider community, albeit subject to noise arising 
from the stochasticity in the choice of speaker-listener pairs, token production and the 
finite size of the store. These dynamics are illustrated in Fig. 1.

If each speaker's store contains only a single token, this model corresponds exactly 
with the Voter Model on the complete graph (fully connected network) or the Moran 
model in population genetics [9, 28]. If the stores are large, and both speaker and
listener produce tokens in an interaction, and both retain copies of their own and their
interlocutor's utterances, we recover the Utterance Selection Model [14]. It turns out that the
consensus-formation behaviour of the model is essentially the same, no matter the size of
the store or whether the roles of speaker and listener are separate or conflated [28, 25]. In
a finite-sized community, one variant will eventually go extinct; the probability that variant A
wins out is equal to its initial frequency in a mean-field community; and the distribution
of extinction times is the same for all models once time has been rescaled by a factor
that depends on the size of the stores and the precise manner of the speaker-listener
interaction (as long as the essential sampling behaviour for production and replacement
of tokens is retained).

Figure 2. Illustration of the update rules for the maximising behaviour as
implemented in the Naming Game when restricted to two variants. The two types
of interaction, success and failure, are shown. After a successful interaction, both
listener and speaker delete from their stores any instances of tokens other than that
just uttered; after a failure, the uttered token is added to the listener's store. Sets of
dashed and solid arrows are to be interpreted as in Fig. 1.

Maximising behaviour: the Naming Game. One way to implement a maximising
behaviour is as formulated in the Naming Game [19]. In this model, a speaker retains at
most one token of a given variant in the store. Again, when placed in the speaking role, 
an agent selects one of the tokens from her store uniformly at random for production. In 
this model, maximising behaviour is implemented by the listener: if the token uttered 
by a speaker matches a token in the listener's store, the interaction is deemed a 'success' 
and the listener erases all instances of other variants from the store. That is, the listener 
associates the locally maximal variant (among his store + the uttered token) uniquely 
with the target meaning after a successful interaction. After a successful interaction, 
the association is further strengthened through the speaker also erasing any tokens in 
her store that do not represent the variant she has just uttered. If the interaction is a 
failure (the listener does not have a matching token in his store), his store is extended 
to include a token of the uttered variant. See Fig. 2.

The hybrid model. With just two variant forms, A and B, in the language, speakers in
the Naming Game behave in one of three ways: produce only A; produce only B; or
produce A or B with equal probability. These same three production rules are realised
in the Voter Model if each speaker's store holds two tokens. We thus denote these three 
states as AA, BB and AB respectively, and further refer to the AA and BB states 
as consistent (since speakers consistently produce a single variant), and the AB state 
as inconsistent. Consensus is reached when all agents are in the same consistent state. 
The same production rule, namely sampling one of the two variants in the store, will be 
applied by speakers in all instances of the hybrid model. 

To place the differing behaviour of the listeners between the two models on the 
same footing, we reformulate the maximising rule that applies in the Naming Game as 
follows: if the listening agent is in the inconsistent state, he places a copy of the uttered
token into his store, overwriting the token that doesn't match. In the Voter Model, this
token is overwritten with probability $\frac{1}{2}$. These two rules can be unified by making this
probability a variable parameter. The prescription in the hybrid model is to replace the
non-matching token with probability $\frac{1}{2}(1 + b)$, where $b$ is a maximisation bias parameter,
which, if equal to 0 recovers Voter-like behaviour, and if equal to 1 recovers
Naming-Game-like behaviour. Note that negative $b$, $b > -1$, is possible; then the listener exhibits
an 'anti-maximisation' behaviour, i.e., a reluctance to adopt consistent use of a single
variant. When the listening agent is in a consistent state, the listener update for both
the Voter Model and Naming Game is the same: viz, random replacement of one of the
tokens in the store.

Figure 3. Illustration of an update rule for the hybrid model that includes the Voter
/ Utterance Selection Model and the two-variant Naming Game as the special cases
$b = c = 0$ and $b = c = 1$ respectively. Labels against the arrows indicate the probability
that the corresponding event (production or replacement) takes place.

This leaves us with the speaker update rule of the Naming Game to implement. This 
occurs when the speaker's uttered token matches one in the listener's store: the 'success' 
of the interaction is communicated to the speaker, who then adopts the corresponding 
consistent state. In the Voter Model, this never happens; in the Naming Game it takes 
place with probability 1. The obvious way to unify these models is to apply this rule 
with a copy probability c. That is, if the listener is in a consistent state at the end of 
the interaction, the speaker copies that state with probability c. 

Thus we arrive at a two-parameter family of models that interpolates between the 
pure sampling behaviour of the Voter / Utterance Selection Model and its relatives 
($b = c = 0$) and maximisation as implemented in the Naming Game ($b = c = 1$).
These dynamics are illustrated in Fig. 3. It is also convenient to summarise the model
definition through the set of transition probabilities presented in Table 1. In the table
we also specify the changes in the number of agents $n_\sigma$ in each of the three states
$\sigma \in \{AA, AB, BB\}$, and also the changes in the total number of tokens $m_A$ and
$m_B$ across all stores in the community. In the following sections we will use this table to
derive the deterministic and stochastic components of the dynamics. 
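
The update rules defined in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the simulation code used for the results reported here: the function names `interact` and `run_to_consensus`, the string encoding of the three agent states and the initial condition (only consistent agents) are all choices made for this sketch.

```python
import random


def interact(speaker, listener, b, c, rng):
    """Apply one hybrid-model interaction and return the new (speaker, listener).

    Agent states are the strings "AA", "AB" and "BB" (a store of two tokens).
    """
    token = rng.choice(speaker)      # production: sample a stored token uniformly
    target = token * 2               # consistent state matching the uttered token
    if listener == "AB":
        # Inconsistent listener overwrites the non-matching token w.p. (1 + b)/2.
        if rng.random() < 0.5 * (1.0 + b):
            listener = target
    elif listener != target:
        # Consistent listener who hears the other variant: replacing either of
        # his two identical tokens leaves him in the inconsistent state.
        listener = "AB"
    # Speaker copies a listener who ends the interaction in a consistent state.
    if listener == target and rng.random() < c:
        speaker = target
    return speaker, listener


def run_to_consensus(n_agents, b, c, x0=0.5, seed=0, max_steps=10**7):
    """Simulate until all tokens are A or all are B; return (winner, interactions)."""
    rng = random.Random(seed)
    n_a = round(n_agents * x0)       # assumed start: only AA and BB agents
    agents = ["AA"] * n_a + ["BB"] * (n_agents - n_a)
    m_a = 2 * n_a                    # total number of A tokens in all stores
    for step in range(1, max_steps + 1):
        i, j = rng.sample(range(n_agents), 2)   # agent i speaks, agent j listens
        new_s, new_l = interact(agents[i], agents[j], b, c, rng)
        m_a += (new_s + new_l).count("A") - (agents[i] + agents[j]).count("A")
        agents[i], agents[j] = new_s, new_l
        if m_a in (0, 2 * n_agents):
            return ("A" if m_a else "B"), step
    return None, max_steps           # no consensus within the step budget
```

Calling `run_to_consensus(50, b=1.0, c=1.0, x0=0.6)` exercises the Naming-Game-like limit, and `b=0.0, c=0.0` the Voter-like limit; averaging the returned winner and step count over many seeds gives estimates of the consensus probability and time.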
Table 1. Probability $P(\sigma, \lambda \to \sigma', \lambda')$ that a speaker-listener pair $(\sigma, \lambda)$ makes the
transition to $(\sigma', \lambda')$ after having been chosen to interact. The columns headed $\delta n_{AA}$,
$\delta n_{AB}$ and $\delta n_{BB}$ indicate changes in the number of AA, AB and BB agents as a result
of the transition. Likewise, those headed $\delta m_A$ and $\delta m_B$ give the change in the total
number of A and B tokens stored by all agents in the community.



3. Deterministic equations of motion 

Deterministic equations of motion are obtained for the fraction of agents in state $\tau$,
$x_\tau = n_\tau/N$, by summing over all possible changes in $n_\tau$ that can occur in an interaction,
given the state of the system at time $t$ and weighted by the probability that the change
occurs. In a mean field community, the probability that a randomly-chosen speaker and
listener are in states $\sigma$ and $\lambda$ respectively is $x_\sigma x_\lambda$ (with a correction of order $1/N$ that
we shall neglect). The probability that this pair then changes state to $\sigma', \lambda'$, and the
corresponding changes in $n_\tau$ can then be read off from Table 1. By performing the sum,
one finds

$$\delta x_\tau = \frac{1}{N} \sum_{\sigma,\lambda} x_\sigma x_\lambda \sum_{\sigma',\lambda'} P(\sigma,\lambda \to \sigma',\lambda')\, \delta n_\tau(\sigma,\lambda \to \sigma',\lambda') . \tag{1}$$

In the large-$N$ limit, the changes $\delta x_\tau$ are small, and we can approximate the left-hand
side of this equation as the time derivative $\dot{x}_\tau$.

Figure 4. Phase diagram for the hybrid model, as predicted by the deterministic
equations of motion (2)-(4).

The resulting equations of motion are most conveniently stated in the space of
token frequencies, $x_A = m_A/(2N)$ and $x_B = m_B/(2N) = 1 - x_A$, within the aggregated
store of the entire community (hence the factor of $2N$). The dynamics cannot be closed
in terms of these two frequencies; one also needs to keep the fraction of inconsistent
speakers $x_{AB}$ for a complete description. After some algebra, one finds

$$\dot{x}_A(t) = \frac{\mu}{2N}\, x_{AB}\, (x_A - x_B) \tag{2}$$

$$\dot{x}_{AB}(t) = \frac{1}{N} \left[ 2 x_A x_B - (1 + \mu)\, x_{AB} - \frac{bc}{2}\, x_{AB}^2 \right] \tag{3}$$

$$\dot{x}_B(t) = \frac{\mu}{2N}\, x_{AB}\, (x_B - x_A) \tag{4}$$

where we have introduced the key parameter

$$\mu = \frac{b + c}{2} , \tag{5}$$

the arithmetic mean of the bias and copy parameters introduced to interpolate between 
the Voter / Utterance Selection Model and the Naming Game. One can show that 
with b = c = 1, one recovers the expressions previously presented in the context of the 
Naming Game [IHl H] and a model for bilingualism [29] . 
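
As a cross-check on the algebra, the sum over transitions can be carried out by brute force. The sketch below enumerates every outcome of one interaction under the update rules of Section 2 and accumulates the mean change per interaction in the number of A tokens and in the number of AB agents; these should equal $\mu x_{AB}(x_A - x_B)$ and $2x_Ax_B - (1+\mu)x_{AB} - \frac{bc}{2}x_{AB}^2$ respectively. The helper names `outcomes` and `mean_changes` are illustrative and not taken from the original work.

```python
from itertools import product


def outcomes(speaker, listener, b, c):
    """Enumerate (probability, new_speaker, new_listener) for one interaction."""
    result = []
    for token in "AB":
        p_token = speaker.count(token) / 2.0     # uniform sampling of the store
        if p_token == 0.0:
            continue
        target = token * 2                       # consistent state for this token
        if listener == "AB":
            moves = [(0.5 * (1 + b), target), (0.5 * (1 - b), "AB")]
        elif listener == target:
            moves = [(1.0, target)]
        else:
            moves = [(1.0, "AB")]
        for p_l, new_l in moves:
            if new_l == target:
                # Listener ends consistent: speaker copies with probability c.
                result.append((p_token * p_l * c, target, new_l))
                result.append((p_token * p_l * (1 - c), speaker, new_l))
            else:
                result.append((p_token * p_l, speaker, new_l))
    return result


def mean_changes(x_aa, x_ab, x_bb, b, c):
    """Mean change per interaction in the A-token count and in the AB-agent count."""
    x = {"AA": x_aa, "AB": x_ab, "BB": x_bb}
    d_ma = d_nab = 0.0
    for s, l in product(x, repeat=2):
        for p, s2, l2 in outcomes(s, l, b, c):
            w = x[s] * x[l] * p
            d_ma += w * ((s2 + l2).count("A") - (s + l).count("A"))
            d_nab += w * ((s2 == "AB") + (l2 == "AB") - (s == "AB") - (l == "AB"))
    return d_ma, d_nab


def closed_forms(x_aa, x_ab, x_bb, b, c):
    """The same two quantities from the closed-form expressions."""
    mu = 0.5 * (b + c)
    x_a, x_b = x_aa + 0.5 * x_ab, x_bb + 0.5 * x_ab
    return (mu * x_ab * (x_a - x_b),
            2 * x_a * x_b - (1 + mu) * x_ab - 0.5 * b * c * x_ab ** 2)
```

Dividing the first quantity by $2N$ and the second by $N$ converts them into changes in the frequencies $x_A$ and $x_{AB}$.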

Although noise has been neglected, these equations involve no further 
approximations. Therefore we may make the following observations. 

If the parameter $\mu = 0$, the mean frequencies of A and B tokens are conserved. This
behaviour corresponds to that of the Voter Model and its relatives, within which it is
known that consensus is brought about by a fluctuation that typically occurs after a
time of order $N^2$ interactions [30]. We anticipate then that this voter-like behaviour
will be exhibited not just at the point $b = c = 0$, but along the line $c = -b$ (recall that
$b$ is permitted to be negative).

For non-zero $\mu$, we see that inconsistent agents present in the community induce
an effective interaction between A and B token frequencies. If $\mu > 0$, any difference in
their number is amplified by the dynamics, whereas if $\mu < 0$, a 'restoring force' opposes
these differences. This suggests consensus will be reached quickly when $\mu > 0$, but will
never be reached (in the absence of noise) when $\mu < 0$. This allows us to draw a phase
diagram for the space of models spanned by the parameters $b$ and $c$ where the line $\mu = 0$
separates a phase in which consensus is reached rapidly from one in which the onset of
consensus is delayed (to infinity, in the noise-free dynamics). See Figure 4. As we will
see below, this phase diagram is retained when we consider a stochastic version of the
dynamics.

The fixed-point structure of equations (2)-(4) yields slightly more information
about these phases. Two fixed points are at $(x_A, x_{AB}, x_B) = (1, 0, 0)$ and $(0, 0, 1)$ and
correspond to the state of consensus. These are stable when $\mu > 0$. In the deterministic
dynamics, the token that is initially in the majority always fixes. A third fixed point
corresponds to a community where $(x_A, x_{AB}, x_B) = (\frac{1}{2}, x_{AB}^*, \frac{1}{2})$ where

$$x_{AB}^* = \begin{cases} \dfrac{1}{2(1+\mu)} & \text{if } b = 0 \text{ or } c = 0 \\[2ex] \dfrac{1+\mu}{bc} \left[ \sqrt{1 + \dfrac{bc}{(1+\mu)^2}} - 1 \right] & \text{otherwise} \end{cases}$$

and which is stable when $\mu < 0$. Unless $x_{AB}^* = 1$, this community exhibits a mixture
of AA, AB and BB agents. The AA and BB agents coexist in equal numbers, whilst
$x_{AA} = x_{BB} = \frac{1}{2}(1 - x_{AB}^*)$. Note that this mixed state comprising consistent and

inconsistent agents is distinct from that found in models where agents that are too 
dissimilar (e.g., AA and BB agents) cannot interact (e.g., [31]). In those models, one 
finds a frozen state where agents do not change their behaviour over time. Here, by 
contrast, the agents perpetually change state since, for example, if an AA agent meets 
a BB, one of them will be an AB after the interaction. 
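
The two deterministic regimes can be checked numerically. The following sketch integrates the equations for $x_A$ and $x_{AB}$ with a classical fourth-order Runge-Kutta step and compares the long-time state with the fixed points discussed above. It makes two assumptions of its own: time is measured in units of $N$ interactions, so that $N$ drops out of the equations, and a fixed step size is used for simplicity. The function names are illustrative.

```python
import math


def rhs(x_a, x_ab, b, c):
    """Right-hand sides for (x_A, x_AB), with one time unit equal to N interactions."""
    mu = 0.5 * (b + c)
    x_b = 1.0 - x_a
    dx_a = 0.5 * mu * x_ab * (x_a - x_b)
    dx_ab = 2.0 * x_a * x_b - (1.0 + mu) * x_ab - 0.5 * b * c * x_ab ** 2
    return dx_a, dx_ab


def integrate(x_a, x_ab, b, c, t_end=200.0, dt=0.01):
    """Fixed-step classical Runge-Kutta integration; returns the final (x_A, x_AB)."""
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(x_a, x_ab, b, c)
        k2 = rhs(x_a + 0.5 * dt * k1[0], x_ab + 0.5 * dt * k1[1], b, c)
        k3 = rhs(x_a + 0.5 * dt * k2[0], x_ab + 0.5 * dt * k2[1], b, c)
        k4 = rhs(x_a + dt * k3[0], x_ab + dt * k3[1], b, c)
        x_a += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        x_ab += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x_a, x_ab


def mixed_fixed_point(b, c):
    """Fraction of AB agents at the symmetric fixed point x_A = x_B = 1/2."""
    mu = 0.5 * (b + c)
    if b * c == 0.0:
        return 0.5 / (1.0 + mu)
    return (math.sqrt((1.0 + mu) ** 2 + b * c) - (1.0 + mu)) / (b * c)
```

Starting from $x_A = 0.6$ with no inconsistent agents, `integrate(0.6, 0.0, 1.0, 1.0)` should end at consensus on A, whereas `integrate(0.6, 0.0, -0.5, 0.0)` should relax to $x_A = \frac{1}{2}$ with $x_{AB}$ equal to `mixed_fixed_point(-0.5, 0.0)`.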

A state with this inconsistent (or 'undecided') character was also seen in the 
generalisation of the Naming Game studied in [22]. This generalisation introduces 
to the basic Naming Game a probability p of updating the agents' states after a 
successful interaction§. The choice $p = 1$ recovers the basic Naming Game dynamics.
By constructing the set of transition probabilities, analogous to Table 1, one finds that
the microscopic dynamics for general $p$ cannot be realised with a judicious choice of our
parameters $b$ and $c$. However, the resulting deterministic equations of motion coincide
with (2)-(4) if one takes

$$bc = p , \tag{8}$$

Thus within the allowed range of $b$, $-1 < b < 1$, the deterministic dynamics of the
Naming Game generalisation described in [22] can be replicated only if c is allowed to 
take on unphysical values (c > 1 or c < 0). Nevertheless, the collective dynamics of 
both generalisations (as described in [22] and the following sections) turn out to be quite 
similar. We return to this point in the discussion. 



that is. 



4. Simplified dynamics 

Our main aim in this work is to understand how the stochastic component of the
dynamics affects the picture described in the previous section. Given the complexity of
the deterministic equations (2)-(4), it seems unlikely that their stochastic counterparts
will be analytically tractable. Our strategy is to use insights from the deterministic
dynamics to design a simplified dynamics that can be couched in terms of a single
stochastic coordinate. Then, the resulting Fokker-Planck equation can be studied using
more-or-less standard methods [33], [9].

§ In [22] this parameter is called $\beta$, but we use a different symbol here to avoid a clash of notation.

Figure 5. Trajectories of the deterministic system (2)-(4) obtained using a numerical
Runge-Kutta-Fehlberg (RKF45) algorithm [32]. In cases (i) and (ii), $\mu > 0$ so the fixed
points at $(x_A, x_{AB}) = (0, 0)$ and $(1, 0)$ are attractors of the dynamics, whereas in case
(iii) $\mu < 0$ and all solutions approach the fixed point at $(\frac{1}{2}, x_{AB}^*)$.

We begin by examining numerical solutions of the system of equations (2)-(4).
These are plotted in Fig. 5 for three combinations of the parameters $b$ and $c$ in the
$x_A$-$x_{AB}$ plane. What is striking is that all solutions begin with a trajectory whereby
the number of AB agents changes whilst the number of A and B tokens remains roughly
constant. Then, a common curve $x_{AB} = f(x_A)$ is followed towards the fixed point. A
look at the time series (not shown) suggests that the time taken to reach this curve is
typically less than 10% of the time taken to reach the fixed point. Therefore, within the
simplified dynamics, we will assume that $x_{AB} \approx f(x_A)$ for the entire trajectory. This
will then yield an equation of motion for a single coordinate $x_A$. Unfortunately, we have
not been able to find an analytic expression for the curve $f(x_A)$. However, it does not
deviate too far from the parabola that passes through $(x_A, x_{AB}) = (0, 0)$, $(\frac{1}{2}, x_{AB}^*)$, $(1, 0)$,
viz, $f(x_A) = 2\alpha x_A(1 - x_A) = 2\alpha x_A x_B$, where we have introduced a second important
parameter

$$\alpha = 2 x_{AB}^* . \tag{10}$$

This approximation can be interpreted in the following way. Suppose first of all that
there is no correlation between the two tokens in each agent's store: given a randomly
chosen agent, the first token is A with probability $x_A$, and so is the second. The
probability that an agent is inconsistent is then $2x_Ax_B$. Thus, the choice $\alpha = 1$ is
equivalent to a mean-field type approximation in which we assume that the two tokens
in each agent's store are uncorrelated. Values of $\alpha$ different from 1 imply correlations
between the pair of tokens contained in a randomly-chosen agent's store. Specifically,
the probability of an agent being in the inconsistent state is $x_{AB} - 2x_Ax_B = 2(\alpha - 1)x_Ax_B$
more likely than chance. Alternatively, if one of the stored tokens is an A, the other is
an A with probability $1 - \alpha x_B$; likewise, if one is a B, so is the other with probability
$1 - \alpha x_A$.
This latter interpretation provides a means to simulate the simplified dynamics
directly. One maintains a collective token store with (notionally) $m_A = 2Nx_A$ instances
of A and $m_B = 2Nx_B$ instances of B. Then, one constructs a speaker in state AA with
probability $x_A(1 - \alpha x_B)$, in state BB with probability $x_B(1 - \alpha x_A)$, or AB otherwise.
Likewise a listener. Then speaker and listener interact, and the token frequencies change
probabilistically according to transition probabilities and the $\delta m$ values given in Table 1.
When $\mu < 0$, it turns out that $\alpha > 1$, which in turn means that the probability of
constructing a speaker in a consistent state becomes negative when the corresponding
token frequency $x < 1 - \frac{1}{\alpha}$. In the event that this occurs, one can simply set the offending
probability to zero.
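
The procedure just described might be implemented along the following lines. This is a sketch only: the names `draw_state`, `interact` and `run_simplified` are invented for illustration, and clamping the token count to the interval $[0, 2N]$ is an extra assumption made here to handle the rare two-token jumps next to a boundary, which the description above does not address.

```python
import random


def draw_state(x_a, alpha, rng):
    """Construct an agent state from the collective A-token frequency x_a."""
    x_b = 1.0 - x_a
    p_aa = max(0.0, x_a * (1.0 - alpha * x_b))   # negative probabilities set to zero
    p_bb = max(0.0, x_b * (1.0 - alpha * x_a))
    u = rng.random()
    if u < p_aa:
        return "AA"
    if u < p_aa + p_bb:
        return "BB"
    return "AB"


def interact(speaker, listener, b, c, rng):
    """One hybrid-model interaction between agents in states "AA", "AB" or "BB"."""
    token = rng.choice(speaker)
    target = token * 2
    if listener == "AB":
        if rng.random() < 0.5 * (1.0 + b):       # listener bias parameter b
            listener = target
    elif listener != target:
        listener = "AB"
    if listener == target and rng.random() < c:  # speaker copy probability c
        speaker = target
    return speaker, listener


def run_simplified(n_agents, b, c, alpha, x0, seed=0, max_steps=10**7):
    """Evolve the collective token store; return (winner, number of interactions)."""
    rng = random.Random(seed)
    m_total = 2 * n_agents
    m_a = round(m_total * x0)
    for step in range(1, max_steps + 1):
        x_a = m_a / m_total
        s = draw_state(x_a, alpha, rng)
        l = draw_state(x_a, alpha, rng)
        s2, l2 = interact(s, l, b, c, rng)
        m_a += (s2 + l2).count("A") - (s + l).count("A")
        m_a = min(max(m_a, 0), m_total)          # assumption: clamp at the boundaries
        if m_a in (0, m_total):
            return ("A" if m_a else "B"), step
    return None, max_steps
```

Running this with `alpha=1.0` and with `alpha` set to twice the fixed-point fraction of AB agents gives the two variants of the simplified dynamics whose predictions are compared in the text.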

It is not a priori clear whether the choice $\alpha = 2x_{AB}^*$ yields a dynamics that is more
faithful to the full hybrid model dynamics than the mean-field approximation $\alpha = 1$.
This we will establish by comparing predictions for these two choices of $\alpha$ with data
from simulations of the full stochastic dynamics (see Section 7).



5. Stochastic equations of motion for the simplified dynamics 

The stochastic equations of motion are obtained by evaluating in addition to (1) the
mean of the square change in token frequencies in the course of an interaction. This
is evaluated by computing a sum over all possible transitions with the help of Table 1,
analogous to the procedure used to derive (1). Since we have reduced to a model whose
state is given by a single parameter $x_A$, we only need to calculate

$$\langle (\delta x_A)^2 \rangle = \frac{\langle (\delta m_A)^2 \rangle}{4N^2} = \frac{1}{4N^2} \left[ 2 x_A x_B + \mu x_{AB} + \left( 1 + \frac{3b}{2} \right) c\, x_{AB}^2 \right] . \tag{11}$$

Using (1), (11) and approximating $x_{AB}$ as $2\alpha x_A(1 - x_A)$, we obtain a Fokker-Planck
equation [31] for the simplified dynamics:

$$\frac{\partial}{\partial t} P(x, t) = \frac{\mu\alpha}{N} \frac{\partial}{\partial x}\, x(1-x)(1-2x) P + \frac{1}{4N^2} \frac{\partial^2}{\partial x^2} \left[ (1 + \mu\alpha)\, x(1-x) + \alpha^2 (2 + 3b) c\, x^2 (1-x)^2 \right] P \tag{12}$$

where we have replaced $x_A$ with $x$ to lighten the notation. Introducing a rescaled time
variable

$$\tau = \frac{(1 + \mu\alpha)\, t}{4N^2} \tag{13}$$

and the two parameters

$$\beta = \frac{4N\mu\alpha}{1 + \mu\alpha} \tag{14}$$

$$\sigma = \frac{\alpha^2 (2 + 3b) c}{4 (1 + \mu\alpha)} \tag{15}$$

we may rewrite the Fokker-Planck equation as

$$\frac{\partial}{\partial \tau} P(x, \tau) = \beta \frac{\partial}{\partial x}\, x(1-x)(1-2x) P + \frac{\partial^2}{\partial x^2}\, x(1-x) \left[ 1 + 4\sigma x(1-x) \right] P , \tag{16}$$

where the factors of 4 are for future convenience. The parameter $\beta$ plays the role of
inverse temperature, controlling the magnitude of the stochastic term. It is proportional
to the community size $N$, indicating that a large community exhibits, if $\mu\alpha \neq 0$,
a low-temperature behaviour in which stochastic effects are expected to act as a
perturbation about the deterministic dynamics.

It is helpful briefly to transform the Fokker-Planck equation from the form 

|p(..r) = -^aWP+i|,.WP (17) 

via a change of variable x —>■ y{x) to the form 

|:P(y,r) = ^%)P. (18) 

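Making this change of variable in the derivatives of (17) reveals that $y(x)$ is required to
satisfy

$$ a(x) \frac{dy}{dx} + \frac{1}{2} b(x) \frac{d^2 y}{dx^2} = 0 \tag{19} $$

so that the first-order term vanishes. This is achieved by setting

$$ \frac{dy}{dx} \propto \exp\left( -2 \int^x dx'\, \frac{a(x')}{b(x')} \right) . \tag{20} $$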

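Defining the 'potential' $V(x)$ through

$$ a(x) = -\frac{\beta}{2} \frac{dV}{dx}\, b(x) \, , \tag{21} $$

which here means

$$ V(x) = \frac{1}{4\sigma} \ln\left[ 1 + 4\sigma x(1-x) \right] \tag{22} $$

(which approaches $V(x) = x(1-x)$ as $\sigma \to 0$), it then follows that

$$ \frac{dy}{dx} = A e^{\beta V(x)} \, . \tag{23} $$

The constants of integration we fix by mapping the interval $x \in [0, 1]$ to $y \in [-1, 1]$ in
such a way that $y(1-x) = -y(x)$. Then,

$$ y(x) = \frac{2 \int_0^x dw\, e^{\beta V(w)}}{\int_0^1 dw\, e^{\beta V(w)}} - 1 \, . \tag{24} $$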

The value of this transformation is that first passage properties [33] can be
found relatively straightforwardly from the backward equation corresponding to (18).
Specifically, the probability $Q(x)$ that the boundary at $x = 1$ is the first boundary to
be encountered by the dynamics is given in terms of the transformed coordinate through
the solution of

$$ h(y) \frac{d^2}{dy^2} Q(y) = 0 \tag{25} $$

subject to the boundary conditions $Q(-1) = 0$ and $Q(1) = 1$ [33, 9]. The solution is

$$ Q(y) = \frac{1+y}{2} \quad \Longrightarrow \quad Q(x) = \frac{\int_0^x dw\, e^{\beta V(w)}}{\int_0^1 dw\, e^{\beta V(w)}} \, . \tag{26} $$
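As a numerical illustration of the transformed coordinate and the consensus probability, the following Python sketch evaluates the ratio of integrals of $e^{\beta V}$ by Simpson's rule. The function names, the grid size and the values $\beta = 40$, $\sigma = 0.5$ are illustrative assumptions and are not taken from the text.

```python
import math

def potential(x, sigma):
    """Potential V(x) = ln[1 + 4 sigma x (1 - x)] / (4 sigma); x(1 - x) when sigma = 0."""
    if sigma == 0.0:
        return x * (1.0 - x)
    return math.log(1.0 + 4.0 * sigma * x * (1.0 - x)) / (4.0 * sigma)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def consensus_probability(x, beta, sigma):
    """Q(x): integral of exp(beta V) over [0, x] divided by the same over [0, 1]."""
    weight = lambda w: math.exp(beta * potential(w, sigma))
    return simpson(weight, 0.0, x) / simpson(weight, 0.0, 1.0)

def transformed_coordinate(x, beta, sigma):
    """y(x), related to the consensus probability by y = 2 Q - 1."""
    return 2.0 * consensus_probability(x, beta, sigma) - 1.0

if __name__ == "__main__":
    beta, sigma = 40.0, 0.5  # illustrative values only
    for x in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(x, consensus_probability(x, beta, sigma))
```

Because $V$ is symmetric about $x = \frac12$, the computed values should satisfy $Q(x) + Q(1-x) = 1$ up to quadrature error, which is a convenient sanity check on any implementation.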
Meanwhile, the mean time to reach the boundary $x = 1$ starting from a position $x$, given
that this is the first boundary to be encountered, is given by the solution of

$$ h(y) \frac{d^2}{dy^2}\, Q(y) T(y) = -Q(y) \, , \tag{27} $$

subject to the boundary conditions that $T(1) = 0$ and $T(-1) < \infty$ [33], [9]. We find

$$ T(y) = I(1) - I(y) \quad \text{where} \quad I(y) = \frac{1}{1+y} \int_{-1}^{y} du\, \frac{(y-u)(1+u)}{h(u)} \tag{28} $$

as long as $I(\epsilon) \sim \epsilon$ as $\epsilon \to 0$ (a condition satisfied here). In terms of the original variable
$x$, this solution reads

$$ T(x) = I(1) - I(x) \tag{29} $$

where

$$ I(x) = \frac{\beta}{2} \left[ \int_0^1 dw\, e^{\beta V(w)} \right] \frac{1}{1+y(x)} \int_0^x du\, e^{-\beta V(u)}\, \frac{[y(x)-y(u)][1+y(u)]}{u(1-u)[1+4\sigma u(1-u)]} \, . \tag{30} $$

Analysis of these results will be performed separately within the three distinct regimes
exhibited by the model.
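The double integral for the mean consensus time can also be evaluated directly. The sketch below is one way to do so with the midpoint rule, assuming the form $I(x) = \frac{\beta}{2} \left[\int_0^1 dw\, e^{\beta V(w)}\right] [1+y(x)]^{-1} \int_0^x du\, e^{-\beta V(u)} [y(x)-y(u)][1+y(u)] / \{u(1-u)[1+4\sigma u(1-u)]\}$ and $T(x) = I(1) - I(x)$. The function names and the grid size are illustrative assumptions.

```python
import math

def potential(x, sigma):
    """Potential V(x) = ln[1 + 4 sigma x (1 - x)] / (4 sigma); x(1 - x) when sigma = 0."""
    if sigma == 0.0:
        return x * (1.0 - x)
    return math.log(1.0 + 4.0 * sigma * x * (1.0 - x)) / (4.0 * sigma)

def mean_consensus_times(beta, sigma, n=2000):
    """Return (xs, ts) with ts[k] = I(1) - I(xs[k]) on the grid xs = 1/n, 2/n, ..., 1."""
    h = 1.0 / n
    mids = [(j + 0.5) * h for j in range(n)]
    weight = [math.exp(beta * potential(u, sigma)) for u in mids]
    z = h * sum(weight)  # integral of exp(beta V) over [0, 1]

    # 1 + y at the cell edges x_k = k h, and (by averaging) at the cell midpoints.
    edge = [0.0]
    for wj in weight:
        edge.append(edge[-1] + 2.0 * h * wj / z)
    mid = [0.5 * (edge[j] + edge[j + 1]) for j in range(n)]

    # Part of the integrand that does not depend on the upper limit x.
    g = [mid[j] / (weight[j] * mids[j] * (1.0 - mids[j])
                   * (1.0 + 4.0 * sigma * mids[j] * (1.0 - mids[j])))
         for j in range(n)]

    # Running sums give sum_j h g_j (y_k - y_j) = (1 + y_k) s0 - s1 for every k.
    i_vals = []
    s0 = s1 = 0.0
    for k in range(1, n + 1):
        s0 += h * g[k - 1]
        s1 += h * g[k - 1] * mid[k - 1]
        i_vals.append(0.5 * beta * z * (edge[k] * s0 - s1) / edge[k])

    xs = [k * h for k in range(1, n + 1)]
    return xs, [i_vals[-1] - ik for ik in i_vals]
```

For small $\beta$ and $\sigma = 0$ the result should approach the purely diffusive form $-\beta (1-x) \ln(1-x) / x$, which follows from setting $y = 2x - 1$ in the integral; this limit is used as the check in the accompanying test.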

6. Consensus properties of the simplified dynamics in large communities 

In this section we analyse the large community-size (large-$N$) properties of the
probability and mean time to reach consensus on the variant $A$ within the simplified
dynamics, given an initial frequency $x$ of $A$ tokens. At any fixed $\mu$, one will access for
sufficiently large $N$ a large-$\beta$ regime that is dominated by the deterministic dynamics:
recall that the parameter $\beta$, which is roughly proportional to $\mu N$, plays the role of
inverse temperature. Nevertheless, stochastic effects make their presence felt, even in
this low-temperature regime, as we will see below. On the other hand, if $\mu$ scales with $N$
as $\mu \sim 1/N$, then both deterministic and stochastic effects are of a similar magnitude.
Here we can gain some insights into the model's dynamics by perturbing around the
purely stochastic case $\mu = 0$, where relaxation to consensus is dominated by a diffusive
mode. Comparison of these analytical results with simulations will follow in the next
section.



6.1. Low-temperature behaviour of the rapid-consensus phase, $\mu > 0$



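When $\mu$ is positive, so is $\beta$, and integrals of the type appearing in (24) are dominated by
the maximum in $V(x)$ at $x = \frac12$. To evaluate $y(x)$, we note that by writing
$x = \frac12 + \sqrt{(1+\sigma)/\beta}\, z$,

$$ V(x) = \frac{1}{4\sigma} \ln(1+\sigma) - \frac{z^2}{\beta} + O\left( \frac{1}{\beta^2} \right) . \tag{31} $$

In particular this implies that the integral

$$ \int_0^1 du\, e^{\beta V(u)} \sim \sqrt{\frac{\pi(1+\sigma)}{\beta}}\, e^{\beta V(1/2)} \tag{32} $$

and, in the central region of width $\sim 1/\sqrt{\beta}$, that

$$ y(x) \sim \mathrm{erf}\left( \sqrt{\frac{\beta}{1+\sigma}} \left( x - \frac12 \right) \right) \tag{33} $$

with corrections of order $1/\beta$. Outside this central region we may instead write

$$ \begin{aligned}
y(x) &= 1 - \frac{2 \int_x^1 du\, e^{\beta V(u)}}{\int_0^1 du\, e^{\beta V(u)}} && (34) \\
&\sim 1 - 2 \sqrt{\frac{\beta}{\pi(1+\sigma)}}\, e^{-\beta V(1/2)} \int_x^1 du\, e^{\beta V(u)} && (35) \\
&\sim 1 + \frac{2 e^{-\beta V(1/2)}}{\sqrt{\pi\beta(1+\sigma)}} \left[ \frac{e^{\beta V(x)}}{V'(x)} + 1 \right] \quad \text{when } x \gg \tfrac12 && (36)
\end{aligned} $$

where, to arrive at the last line we inserted $1 = V'/V'$ into the integrand and integrated by
parts, and further used that $V(1) = 0$ and $V'(1) = -1$. A similar result is obtained for
$x \ll \frac12$ by invoking the antisymmetry property $y(1-x) = -y(x)$: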



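$$ y(x) \sim -1 + \frac{2 e^{-\beta V(1/2)}}{\sqrt{\pi\beta(1+\sigma)}} \left[ \frac{e^{\beta V(x)}}{V'(x)} - 1 \right] \quad \text{when } x \ll \tfrac12 . \tag{37} $$

An expression for the consensus probability then follows from (26).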

We now turn to mean consensus time, given analytically for the simplified dynamics
by the integral (30). Evaluation of this integral is a somewhat more involved enterprise
than it was for the consensus probability. To avoid an excessively tedious presentation,
we will omit some steps of routine algebra and focus on the main ideas behind the
derivation.

We anticipate that the mean consensus time from a given initial condition $x$ will
be an increasing function of the number of agents, $N$, and hence $\beta$. Therefore, we will
systematically drop any terms in (30) that vanish in the limit $\beta \to \infty$. For example,
consider the regime $0 < x < \frac12$, where by the second inequality we mean
$\frac12 - x \gg 1/\sqrt{\beta}$. At the bottom end of the integral, we have that
$[1 + y(u)]/[1 + y(x)]$ vanishes exponentially fast with $\beta$. Hence, in (30), the combination
behaves as

$$ \frac{y(x) - y(u)}{1 + y(x)} = \frac{[1 + y(x)] - [1 + y(u)]}{1 + y(x)} = 1 - \epsilon_\beta \tag{38} $$

in which $\epsilon_\beta$ is an exponentially small correction and will hence be neglected. Using (32)
and (37), one can further show that

$$ \frac{\beta}{2} \left[ \int_0^1 dv\, e^{\beta V(v)} \right] e^{-\beta V(u)}\, [1 + y(u)] \sim \frac{1}{V'(u)} \left[ 1 - V'(u)\, e^{-\beta V(u)} \right] . \tag{39} $$

Substituting into (30), and using (21), we find that

$$ I(x) \sim \int_0^x \frac{du}{u(1-u)(1-2u)} \left[ 1 - V'(u)\, e^{-\beta V(u)} \right] \tag{40} $$

for $x < \frac12$. In principle we should add to this contributions to (30) from the top end,
$u = x$. However, by using (37), and taking into account that $1 + y(u)$ and $1 + y(x)$ are
of comparable magnitude in this regime, one ultimately finds that these contributions
vanish in the limit $\beta \to \infty$.

This expression has an interesting interpretation. Since the mean time to reach the
right boundary $x = 1$ (conditioned on that being the first boundary that is encountered)
satisfies $T(x_1) - T(x_2) = I(x_2) - I(x_1)$, we have, if $x_1$ and $x_2$ are both sufficiently far
away from the left boundary that the term $e^{-\beta V(u)}$ can be neglected,

$$ T(x_1) - T(x_2) = I(x_2) - I(x_1) \sim \int_{x_1}^{x_2} \frac{du}{u(1-u)(1-2u)} = \left[ \ln \frac{u(1-u)}{(1-2u)^2} \right]_{x_1}^{x_2} . \tag{41} $$

The right-hand side of this expression can be recognised as the integral $\int_{x_2}^{x_1} du / a(u)$, where
$a(u) = -u(1-u)(1-2u)$ is the deterministic 'force' term that appears in the Fokker-Planck equation (16).
In a deterministic interpretation, this is the time derivative of $x$, and so the integral
gives the time taken for the deterministic dynamics to reach $x_1$ from the point $x_2 > x_1$
(assuming $x_2 < \frac12$). We see then that, when the stochastic dynamics are conditioned
on being absorbed at the boundary $x = 1$, the additional time needed to traverse the
interval $[x_1, x_2]$ is on average equal to that required by the deterministic dynamics,
even though this pushes the coordinate $x$ in the opposite direction to the stochastic
fluctuation!
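The deterministic interpretation can be checked numerically. The sketch below, in Python, integrates $dx/dt = a(x)$ with $a(x) = -x(1-x)(1-2x)$ by forward Euler steps from $x_2$ down to $x_1$ and compares the elapsed time with the closed form $\ln[u(1-u)/(1-2u)^2]$ evaluated between $x_1$ and $x_2$. The function names, the step size and the end points $x_1 = 0.1$, $x_2 = 0.4$ are illustrative assumptions.

```python
import math

def drift(u):
    """Deterministic 'force' a(u) = -u (1 - u) (1 - 2u), in rescaled time units."""
    return -u * (1.0 - u) * (1.0 - 2.0 * u)

def traversal_time_closed_form(x1, x2):
    """[ln(u (1 - u) / (1 - 2u)^2)] evaluated between x1 and x2, for 0 < x1 < x2 < 1/2."""
    f = lambda u: math.log(u * (1.0 - u) / (1.0 - 2.0 * u) ** 2)
    return f(x2) - f(x1)

def traversal_time_euler(x1, x2, dt=1e-5):
    """Time for dx/dt = a(x) to fall from x2 to x1, using forward Euler steps."""
    x, t = x2, 0.0
    while x > x1:
        x += drift(x) * dt
        t += dt
    return t

if __name__ == "__main__":
    print(traversal_time_closed_form(0.1, 0.4), traversal_time_euler(0.1, 0.4))
```

The two numbers should agree to within the first-order accuracy of the Euler scheme.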

In the deterministic dynamics, one cannot obtain the consensus time through the
limit $x_2 \to 0$, as the integral diverges. Typically, given the underlying discrete nature
of the process, one imposes the cut-off $x_2 = 1/N$. Within the stochastic dynamics this
cut-off is handled automatically by the term in the square brackets of (40). Integrating
by parts,

$$ \begin{aligned}
I(x) &\sim \ln\left[ \frac{x(1-x)}{(1-2x)^2} \right] \left[ 1 - V'(x)\, e^{-\beta V(x)} \right] + \int_0^x du\, \ln\left[ \frac{u(1-u)}{(1-2u)^2} \right] \left( V'' - \beta V'^2 \right) e^{-\beta V} && (42) \\
&\sim \ln\left[ \frac{x(1-x)}{(1-2x)^2} \right] - \int_0^\infty dv\, \ln\left( \frac{v}{\beta} \right) e^{-v} && (43) \\
&= \ln\left[ \frac{x(1-x)}{(1-2x)^2} \right] + \ln\beta + \gamma && (44)
\end{aligned} $$

where $\gamma = -\int_0^\infty dx\, \ln x\, e^{-x} = 0.5772\ldots$ is the Euler-Mascheroni constant (see
§4.331). In the second line we made the change of variable $v = \beta u$, expanded $V$ and its
derivatives around $x = 0$ and discarded terms of order $1/\beta$ and lower.
In the Appendix, we show that the relation

$$ \lim_{\beta \to \infty} \left[ I(x) + I(1-x) - I(1) \right] = 0 \tag{45} $$

holds in the range $|x - \frac12| \gg 1/\sqrt{\beta}$. It then automatically follows that the consensus time
from an initial coordinate $x \gg \frac12$ has as its leading terms $T(x) = I(1) - I(x) = I(1-x)$,
i.e., the expression (44) with $x \to 1 - x$. Then to find the mean consensus time for a
minority variant, $x \ll \frac12$, all that is needed is an expression for $I(1)$.
In this one remaining integral, the contribution from the range $u \in [0, x]$ when
$\frac12 - x \gg 1/\sqrt{\beta}$ is given by (44). Likewise, by symmetry of the integrand, the range
$u \in [1-x, 1]$ contributes the same amount. To estimate the contribution from the
central part of the integral, we make the change of variable $u = \frac12 + \sqrt{(1+\sigma)/\beta}\, v$. The
central part of the integral then approaches

$$ \sqrt{\pi} \int_{-v_0}^{v_0} dv\, e^{v^2} \left[ 1 - \mathrm{erf}^2(v) \right] \tag{46} $$

as $\beta \to \infty$ with $v_0$ fixed (so that the range over which the original integral was performed
becomes ever narrower). In this central region we are justified in using the error function
(33) as an approximation to $y(x)$. At large arguments $\sqrt{\pi}\, e^{v^2} [1 - \mathrm{erf}(v)] \sim 1/v$, and so
we anticipate that for large $v_0$ this integral grows as $4 \ln v_0 + \Gamma$, where $\Gamma$ is a constant
we estimate by numerical integration (up to $v_0 = 200$) as $\Gamma \approx 2.541$. Substituting
$x = \frac12 + \sqrt{(1+\sigma)/\beta}\, v_0$ into (44), and adding this central contribution, we find that the
terms involving $v_0$ cancel, leaving

$$ I(1) \sim 4 \ln\beta + 2\gamma + \Gamma - 4 \ln 2 - 2 \ln(1+\sigma) \tag{47} $$

plus corrections that vanish with $\beta$.
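The constant $\Gamma$ can be re-estimated with a few lines of Python. The sketch below assumes that the central integrand is $\sqrt{\pi}\, e^{v^2} [1 - \mathrm{erf}^2(v)]$, rewrites $1 - \mathrm{erf}^2$ as $\mathrm{erfc}\,(1 + \mathrm{erf})$ to avoid cancellation, and uses a smaller cut-off ($v_0 = 20$) than the text so that $e^{v^2}$ stays within double-precision range. The function names, the cut-off and the grid size are illustrative assumptions.

```python
import math

def central_integrand(v):
    """sqrt(pi) * exp(v^2) * [1 - erf(v)^2], evaluated as erfc(v) * (1 + erf(v))."""
    v = abs(v)  # the integrand is even in v
    return math.sqrt(math.pi) * math.exp(v * v) * math.erfc(v) * (1.0 + math.erf(v))

def gamma_estimate(v0=20.0, n=20000):
    """Simpson estimate of the integral over [-v0, v0], minus 4 ln(v0)."""
    h = v0 / n  # integrate over [0, v0] and double, using the symmetry
    total = central_integrand(0.0) + central_integrand(v0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * central_integrand(i * h)
    return 2.0 * total * h / 3.0 - 4.0 * math.log(v0)

if __name__ == "__main__":
    print(gamma_estimate())
```

At finite $v_0$ the estimate differs from its limiting value by a term of order $1/v_0^2$, so agreement with the quoted $\Gamma \approx 2.541$ is expected only to two or three decimal places at this cut-off.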

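We may now summarise the forms of the mean time $T(x)$ to reach consensus on $A$,
given that initially a fraction $x$ of all tokens were of $A$, as

$$ T(x) \sim \begin{cases}
4 \ln\beta + 2\gamma - \delta - 2 \ln(1+\sigma) & x \to 0 \\
3 \ln\beta + \gamma - \delta - 2 \ln(1+\sigma) - \ln \dfrac{x(1-x)}{(1-2x)^2} & 0 < x < \frac12 \\
\ln\beta + \gamma + \ln \dfrac{x(1-x)}{(2x-1)^2} & \frac12 < x < 1 \\
0 & x \to 1
\end{cases} \tag{48} $$

where $\delta = 4 \ln 2 - \Gamma \approx 0.2316$ and $\gamma \approx 0.5772$.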
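For comparison with simulation data it can be convenient to have the piecewise asymptotic form as a function. The Python sketch below transcribes the four cases, taking $\Gamma = 2.541$ from the text; the function and constant names are illustrative assumptions, and the expression is not defined at $x = \frac12$, where the logarithm diverges.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
BIG_GAMMA = 2.541                 # numerical constant quoted in the text

def asymptotic_consensus_time(x, beta, sigma):
    """Leading large-beta form of T(x) for 0 <= x <= 1, excluding x = 1/2."""
    delta = 4.0 * math.log(2.0) - BIG_GAMMA
    common = 2.0 * math.log(1.0 + sigma)
    if x == 0.0:
        return 4.0 * math.log(beta) + 2.0 * EULER_GAMMA - delta - common
    if x == 1.0:
        return 0.0
    shape = math.log(x * (1.0 - x) / (1.0 - 2.0 * x) ** 2)
    if x < 0.5:
        return 3.0 * math.log(beta) + EULER_GAMMA - delta - common - shape
    return math.log(beta) + EULER_GAMMA + shape

if __name__ == "__main__":
    for x in (0.0, 0.25, 0.75, 1.0):
        print(x, asymptotic_consensus_time(x, beta=1000.0, sigma=0.5))
```

The difference between the minority and majority branches at mirrored points $x$ and $1-x$ is $2\ln\beta - \delta - 2\ln(1+\sigma) - 2\ln[x(1-x)/(1-2x)^2]$, which makes the step of height $2\ln\beta$ explicit.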

Recall that we have rescaled time so that one unit of time corresponds to $N/\mu\alpha$
interactions between pairs of agents. Recall also that $\beta$ is proportional to the number
of agents $N$. Therefore, to reach consensus on $A$, conditioned on this being the final
outcome, each agent must interact of order $\ln N$ times (ignoring prefactors), no matter
the initial frequency of $A$ tokens. Naively, one might expect that the time to reach
consensus on $A$ should increase dramatically from an initial condition in which it is
in the minority. If the frequency $x$ is interpreted as a particle coordinate, it has to
overcome a 'potential barrier' at $x = \frac12$. This event we expect to be exponentially rare
in the low-temperature (large-$\beta$) limit, which it is, and hence to take an exponentially long
time to occur on average. This would be true if it were not for the fact that absorption
takes place at the boundaries $x = 0, 1$. To reach the boundary $x = 1$ from $x < \frac12$, the
particle needs to avoid the boundary $x = 0$. It seems likely that this is achieved through
an early-time fluctuation over the barrier to the region $x > \frac12$. This would then provide
a mechanism for all consensus times being of order $\ln N$.

6.2. Low-temperature behaviour of the delayed-consensus phase: $\mu < 0$

We turn now to the case of $\mu < 0$ where, in the deterministic limit, an inconsistent state
(no consensus) is always reached. The effect of noise is to allow the consistent absorbing
state to be reached. Hence the inconsistent state is a metastable state, and the
two consistent states are the true steady states of the dynamics.

When $\mu < 0$, the earlier rescaling of time to arrive at (16) would involve our taking
time to decrease towards $-\infty$ as the system evolves. Since this is somewhat confusing,
we make instead the change of variable

$$ \tau = \frac{|\mu|\alpha}{N}\, t \tag{49} $$

which leads to the Fokker-Planck equation

$$ \frac{\partial}{\partial\tau} P(x,\tau) = -\frac{\partial}{\partial x}\, x(1-x)(1-2x)\, P + \frac{1}{|\beta|} \frac{\partial^2}{\partial x^2}\, x(1-x) \left[ 1 + 4\sigma x(1-x) \right] P \, , \tag{50} $$

where we note that if $\mu$ is negative, then so is $\beta$. We see that the effect of changing the
sign of $\mu$ is to flip the sign of the deterministic term. That is, a particle now has to diffuse
out of a potential well centred on $x = \frac12$ to reach a boundary point $x = 0, 1$. Here, the
physical intuition that the time to reach the boundary should increase exponentially with
$|\beta|$ and the well depth is valid. Furthermore, one anticipates that the strong attraction
towards the potential minimum that occurs when $|\beta|$ is large leads to the initial condition
being forgotten, and either boundary ultimately being reached with equal probability.
The exception would be for a particle starting close to a boundary, and which experiences
an early-time fluctuation that leads to almost immediate absorption.

We now confirm these expectations explicitly. Recall that the transformed
coordinate $y(x)$ is given by the integral (24), which can be written as

$$ y(x) = \frac{\int_{1/2}^{x} du\, e^{-|\beta| V(u)}}{\int_{1/2}^{1} du\, e^{-|\beta| V(u)}} \tag{51} $$

when $\beta$ is negative. Here the function $V(x)$ is still as given by the positive function
(22): we write the minus signs that appear explicitly. Both integrals are dominated by
their upper end-point, i.e.,

$$ \int_{1/2}^{x} du\, e^{-|\beta| V(u)} = -\frac{e^{-|\beta| V(x)}}{|\beta| V'(x)} + O(1/|\beta|^2) \, . \tag{52} $$

Hence, when $|x - \frac12| \gg 1/\sqrt{|\beta|}$, we have asymptotically that

$$ y(x) \sim -\frac{e^{-|\beta| V(x)}}{V'(x)} \, . \tag{53} $$

Notice that $y(x)$ differs significantly from zero only in a region of size $1/|\beta|$ near the
boundary points $x = 0$ and $x = 1$. Therefore, the probability of consensus on $A$ from
initial frequency $x$ is, via the expression (26), $Q(x) = \frac12 [1 + y(x)]$, close to one half for
all $x$ except in these boundary regions, as previously claimed.
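The plateau at one half is easy to see numerically. The Python sketch below evaluates $Q(x) = \frac12 [1 + y(x)]$ as the ratio of integrals of $e^{\beta V}$ with a negative $\beta$, using the midpoint rule. The function names, the grid size and the values $\beta = -200$, $\sigma = 0.5$ are illustrative assumptions.

```python
import math

def potential(x, sigma):
    """Potential V(x) = ln[1 + 4 sigma x (1 - x)] / (4 sigma); x(1 - x) when sigma = 0."""
    if sigma == 0.0:
        return x * (1.0 - x)
    return math.log(1.0 + 4.0 * sigma * x * (1.0 - x)) / (4.0 * sigma)

def consensus_probability(x, beta, sigma, n=4000):
    """Q(x) by the midpoint rule on n cells; beta may be negative."""
    h = 1.0 / n
    total = partial = 0.0
    for j in range(n):
        u = (j + 0.5) * h
        w = math.exp(beta * potential(u, sigma))
        total += w
        if u < x:  # cells lying to the left of x
            partial += w
    return partial / total

if __name__ == "__main__":
    for x in (0.002, 0.01, 0.2, 0.5, 0.8, 0.99, 0.998):
        print(x, consensus_probability(x, beta=-200.0, sigma=0.5))
```

With these values the output should stay very close to 0.5 across the interior and depart from it only within a distance of order $1/|\beta|$ of either boundary.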

The integral $I(x)$, (30), that appears in the formula $T(x) = I(1) - I(x)$ for consensus
time, (29), can be written as

$$ I(x) = \frac{|\beta|}{2} \left[ \int_0^1 dw\, e^{-|\beta| V(w)} \right] \frac{1}{1+y(x)} \int_0^x du\, e^{|\beta| V(u)}\, \frac{[y(x)-y(u)][1+y(u)]}{u(1-u)[1+4\sigma u(1-u)]} \sim \frac{1}{1+y(x)} \int_0^x du\, e^{|\beta| V(u)}\, \frac{[y(x)-y(u)][1+y(u)]}{u(1-u)[1+4\sigma u(1-u)]} \, , \tag{54} $$

where we have used (52) to evaluate the first integral in (54) to leading order in $|\beta|$.
Note that the units of time here are those introduced earlier in this Section.

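If $x - \frac12 \gg 1/\sqrt{|\beta|}$, the integral is dominated by the midpoint $u = \frac12$ where the
function $V(x)$ has a maximum. In this region, $y(u)$ is roughly constant and close to
zero. Hence, to leading order in $|\beta|$,

$$ \begin{aligned}
I(x) &\sim \frac{4}{1+\sigma}\, \frac{y(x)}{1+y(x)}\, e^{|\beta| V(1/2)} \int_{-\infty}^{\infty} du\, e^{-\frac12 |\beta|\, |V''(1/2)|\, (u - \frac12)^2} && (55) \\
&= 4 \sqrt{\frac{\pi}{|\beta| (1+\sigma)}}\, \frac{y(x)}{1+y(x)}\, e^{|\beta| V(1/2)} \, . && (56)
\end{aligned} $$

Contributions from either end-point are subdominant, and we see that because $y(x)$
decays rapidly to zero near $x = 1$, the dominant contribution to the difference
$T(x) = I(1) - I(x)$ is from $I(1)$ when the distance from the right boundary is much
larger than $1/|\beta|$. When $x < \frac12$, the mid-point does not contribute at all, and one has
only the subleading end-point contributions to the integral, and hence one finds that
for sufficiently large $|\beta|$ the consensus time is roughly constant over the intermediate
range of $x$. That is,

$$ T(x) \sim 2 \sqrt{\frac{\pi}{|\beta| (1+\sigma)}}\, e^{|\beta| V(1/2)} = 2 \sqrt{\frac{\pi}{|\beta| (1+\sigma)}}\, (1+\sigma)^{|\beta| / 4\sigma} \, , \tag{57} $$

where we used the fact that $y(x) \to 1$ as $x \to 1$. Note that in the limit $\sigma \to 0$, we have

$$ T(x) \sim 2 \sqrt{\frac{\pi}{|\beta|}}\, e^{|\beta| / 4} \, . \tag{58} $$

This analysis thus confirms our expectation that the coordinate $x$ is typically pinned
to a value of approximately $\frac12$ for a time that increases exponentially with the inverse
temperature $|\beta|$.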

6.3. The crossover regime: $|\mu| \sim 1/N$

As previously discussed, one can define a crossover regime between the rapid- and
delayed-consensus phases in a large community of size $N$ if the magnitude of $\mu \sim 1/N$.
Then $\beta$ is of order unity and both the deterministic and stochastic terms in the Fokker-
Planck equation (16) are of similar magnitude.

To study this crossover, we take $\mu = \nu / N$, which in turn implies that both terms in
the original Fokker-Planck equation (12) for the simplified dynamics are of order $1/N^2$.
Thus here we should rescale time through

$$ \tau = \frac{t}{N^2} \tag{59} $$

so that then in the limit $N \to \infty$ the Fokker-Planck equation reads

$$ \frac{\partial}{\partial\tau} P(x,\tau) = \nu\alpha \frac{\partial}{\partial x}\, x(1-x)(1-2x)\, P + \frac14 \frac{\partial^2}{\partial x^2}\, x(1-x) \left[ 1 + 4\sigma x(1-x) \right] P \, , \tag{60} $$

where, in this limit,

$$ \sigma = \frac{\alpha^2 (2+3b) c}{4} \, . \tag{61} $$
There are two distinct cases that arise in the infinite-community limit, $N \to \infty$.
Case I: If both $b$ and $c$ vanish at least as fast as $1/N$, so that

$$ \nu = \lim_{N \to \infty} \frac{N(b+c)}{2} \tag{62} $$

is not infinite, the parameter $\alpha = 2x^*_{AB}$, where $x^*_{AB}$ is given by (9), tends to 1 and
$\sigma \to 0$. Then the Fokker-Planck equation simplifies to

$$ \frac{\partial}{\partial\tau} P(x,\tau) = \nu \frac{\partial}{\partial x}\, x(1-x)(1-2x)\, P + \frac14 \frac{\partial^2}{\partial x^2}\, x(1-x)\, P \, . \tag{63} $$

Repeating the analysis following Eq. (16) leads to the following modified expressions for
the transformed coordinate

$$ y(x) = \frac{2 \int_0^x du\, e^{4\nu u(1-u)}}{\int_0^1 du\, e^{4\nu u(1-u)}} - 1 \tag{64} $$

that appears in the consensus probability $Q(x) = \frac12 [1 + y(x)]$ and the integral

$$ I(x) = \frac{2 \int_0^1 dw\, e^{4\nu w(1-w)}}{1+y(x)} \int_0^x du\, e^{-4\nu u(1-u)}\, \frac{[y(x)-y(u)][1+y(u)]}{u(1-u)} \tag{65} $$

that appears in the consensus time $T(x) = I(1) - I(x)$.

To get a feel for the crossover in this case, we expand about $\nu = 0$ (the Voter
Model) to first order in $\nu$ (although higher-order corrections could also be computed).
We find

$$ Q(x) = x + \frac23\, x(1-x)(2x-1)\, \nu + O(\nu^2) \tag{66} $$

$$ T(x) = -4\, \frac{(1-x) \ln(1-x)}{x} + \frac{8(1-x)}{x} \left( \frac19\, x(2x-7) - \frac23\, (1-x) \ln(1-x) \right) \nu + O(\nu^2) \, . \tag{67} $$

As expected, consensus probability increases slightly for the majority variant when $\nu$
is small and positive. Intriguingly, to first order in $\nu$, the mean time to consensus on the
$A$ variant from any initial fraction of $A$ tokens is reduced relative to the purely diffusive
case, $\nu = 0$, when $\nu$ is positive. Again one might expect an increase in consensus times for
minority variants. This we ascribe once again to the conditioning on trajectories that lead to
consensus on $A$ when calculating $T(x)$.
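The first-order result for the consensus probability can be checked against a direct quadrature. The Python sketch below assumes the Case I form $Q(x) = \int_0^x du\, e^{4\nu u(1-u)} / \int_0^1 du\, e^{4\nu u(1-u)}$ and compares it with $x + \frac23 x(1-x)(2x-1)\nu$, the coefficient $\frac23$ being what a direct expansion of that ratio gives. The function names, the grid size and the test values are illustrative assumptions.

```python
import math

def q_exact(x, nu, n=2000):
    """Q(x) as a ratio of Simpson estimates of the integral of exp(4 nu u (1 - u))."""
    def integral(upper):
        h = upper / n
        total = 1.0 + math.exp(4.0 * nu * upper * (1.0 - upper))  # end-point values
        for i in range(1, n):
            u = i * h
            total += (4.0 if i % 2 else 2.0) * math.exp(4.0 * nu * u * (1.0 - u))
        return total * h / 3.0
    return integral(x) / integral(1.0)

def q_first_order(x, nu):
    """Voter Model value x plus the first-order correction in nu."""
    return x + (2.0 / 3.0) * x * (1.0 - x) * (2.0 * x - 1.0) * nu

if __name__ == "__main__":
    for x in (0.2, 0.5, 0.8):
        print(x, q_exact(x, 0.01), q_first_order(x, 0.01))
```

For small $\nu$ the two columns should differ only at order $\nu^2$, and both should exceed $x$ when $x > \frac12$ and fall below it when $x < \frac12$.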

Case II: The other possibility is that $b$ and $c$ approach nonzero constants that have the
same magnitude but opposite sign, i.e.,

$$ b \sim -c^* + \frac{b'}{N^\delta} \quad \text{and} \quad c \sim c^* + \frac{c'}{N^\epsilon} \, , \tag{68} $$

where $\delta, \epsilon \ge 1$. In this case, both $\alpha$ and $\sigma$ have nontrivial limiting values

2 / / r-^\ , a2(2-3c*)c* , , 

« 7^ (l - - and ^ ^ ' . (69) 
The integrals for $y$ and $I$ are not as simple now:

$$ y(x) = \frac{2 \int_0^x du\, e^{\nu v(u)}}{\int_0^1 du\, e^{\nu v(u)}} - 1 \tag{70} $$

$$ I(x) = \frac{2 \int_0^1 dw\, e^{\nu v(w)}}{1+y(x)} \int_0^x du\, e^{-\nu v(u)}\, \frac{[y(x)-y(u)][1+y(u)]}{u(1-u)[1+4\sigma u(1-u)]} \tag{71} $$

where here the potential is

$$ v(x) = \frac{\alpha}{\sigma} \ln\left[ 1 + 4\sigma x(1-x) \right] . \tag{72} $$

In principle an expansion to first order in $\nu$ is possible, although in practice the resulting
expressions are rather cumbersome and one does not gain any new insights.

7. Comparison with Monte Carlo simulation 

In deriving the analytical results of the previous section we have made two
approximations: (i) that the community size is large, and thus that we are justified in
keeping only the leading terms in an asymptotic expansion; and (ii) that the dynamics
is well described by the stochastic evolution of a single coordinate, obtained through
the simplification described in Section 4. In this section, we examine the effect of
these approximations independently by Monte Carlo simulation of (i) the simplified
dynamics, using the algorithm outlined in Section 4, which will reveal any large finite-
size contributions and (ii) the full dynamics, i.e., implementing the stochastic update
rules presented in Table 1, which will show to what extent the simplified dynamics acts as
a proxy for the full dynamics. Our main findings are that consensus probabilities appear
to be well predicted by the simplified dynamics for both models, but that the simplified
dynamics does not provide a complete quantitative account of the mean consensus time
for the full dynamics outside the crossover regime.

7.1. Finite-size behaviour of the simplified dynamics 

In the regime $\mu > 0$, the consensus probability as a function of the initial $A$ token
frequency, $x$, is predicted to follow an error-function curve, $\mathrm{erf}\, \xi$, in the rescaled variable
$\xi = \sqrt{\beta / (1+\sigma)}\, (x - \frac12)$. This prediction is easily tested by plotting the fraction of
realisations of the dynamics starting from given token frequency $x$ that reach the
boundary $x = 1$ as a function of $\xi$ for different combinations of model parameters.

Fig. 6(a) shows these data for a range of combinations of $b$ and $c$ corresponding to
;x = i(6 + c) = {i, |, 1} and system sizes = {250, 500, 1000, 2000}. Agreement with 
the error function, shown as a solid line, is good.

Meanwhile, when $\mu < 0$, the theory predicts that $Q(x) \approx \frac12$ except in boundary
regions of size $1/|\beta|$ where one would expect to see the exponential decays given by
(53). The data shown in Fig. 6(b) for the case $\mu = -0.025$ (obtained through the
particular parameter combination $b = -0.65$, $c = 0.6$; all combinations give similar
results) are consistent with these expectations as long as we take $\alpha = 1$, rather than
Figure 6. Consensus probability $Q$ as a function of initial token frequency $x$ within
the simplified dynamics. Points show results from direct Monte Carlo simulation of
the simplified dynamics for a range of different system sizes and choices for the bias
and copy parameters; curves show analytical predictions. (a) In the rapid-consensus
phase, $\mu > 0$, data are expected to collapse onto an error function after change of
variable $\xi = \sqrt{\beta / (1+\sigma)}\, (x - \frac12)$. Each cloud of points contains results from 36 different
simulation conditions (see text), and the two parameters $\alpha$ and $\sigma$ are as given by (10)
and (15). Errors are approximately the same size as symbol sizes (slightly larger in
the inset) and have been omitted for clarity. (b) In the delayed-consensus phase,
$\mu < 0$, the consensus probability is predicted to approach $\frac12$ in the central region,
with deviations in boundary regions of size of order $1/|\beta|$. Solid lines show a fit to the
predicted boundary behaviour (53) with the parameter setting $\alpha = 2x^*_{AB}$; dashed lines
the prediction with $\alpha = 1$.

$\alpha = 2x^*_{AB}$, in Eq. (53). This is despite the fact that the parameter $\alpha$ enters into the
simulation algorithm, and was set to the latter value. One possible explanation for the
better fit with $\alpha = 1$ is that the consensus probability near the boundaries is
dominated by the dynamics in those regions. There, as observed in Section 4, it was
necessary to artificially set the agent frequency $x_{AA}$ to zero should the token frequency
$x_A$ become sufficiently small that the approximation $x_{AB} = 2\alpha x_A(1 - x_A)$ demands a
negative frequency of $AA$ agents. This is then equivalent to insisting that $x_{AB} = 2x_A$,
i.e., a value of $\alpha = 1$ in these boundary regions.

We now examine numerical data for the mean consensus time in the simplified
dynamics. The results (48) indicate that when $\mu > 0$, $T(x)$ should converge to a step
function with height $3 \ln\beta$ at $x < \frac12$ and $\ln\beta$ at $x > \frac12$ as $N \to \infty$. Different choices of the
bias and copy parameters give very similar results: we show as a representative example
the case $b = c = \frac12$ in Fig. 7. The data suggest that this convergence will be achieved
in the region $x > \frac12$, albeit slowly. Even for the largest system simulated, $\ln\beta$ is smaller
than 10, and therefore we expect the $O(1)$ corrections in (48) still to be significant.
In the region $x < \frac12$, the situation is less clear. As we have seen, the probability of
reaching consensus on $A$ in the regime $x \ll \frac12$, for which the consensus time behaviour
has been calculated, vanishes exponentially with system size. Therefore it is difficult to
Figure 7. Leading large-$N$ behaviour of the mean consensus time $T(x)$. The dashed
line shows the step-function onto which $T(x)/\ln\beta$ should converge as $N \to \infty$. The
points show data from Monte Carlo simulations with $b = c = \frac12$ and various $N$. Each
point is an average over lO^/iV realisations of the dynamics from the initial condition; 
the error bar shows the standard error on this mean, and is larger for $x < \frac12$ because
fewer realisations reach consensus on $A$ when it is initially the minority variant.



sample the large-$\beta$ behaviour in the stochastic simulations and one cannot state with
a high degree of confidence that the simulations reproduce the predicted step-function
form. The fact that at the midpoint $x = \frac12$, $T(x)$ appears to be converging onto $2 \ln\beta$ is
perhaps suggestive that this asymptote might eventually be realised, given a sufficient
quantity of data for large system sizes.

We reach similar conclusions when we examine the $O(1)$ contributions to $T(x)$. In
Fig. 8 we show for a range of $(b, c)$ combinations the difference between $T(x)$ and the
leading logarithmic term. The fit to the function $\gamma + \ln[x(1-x)/(2x-1)^2]$ is good in the
region $x > \frac12$. Again, the data are not necessarily inconsistent with the prediction in the
range $x < \frac12$, but again, we have the problem that the result (48) is valid in the regime
where $\frac12 - x \gg 1/\sqrt{\beta}$, which is precisely the regime where the fixation probability is
very small. For example, in the plot of Fig. 8(b), each tick mark on the horizontal axis
roughly corresponds to $1/(2\sqrt{\beta})$ for the parameter combinations shown, and we see that
the combination $\sqrt{\beta}\, (\frac12 - x)$ rarely exceeds a value of 3 in practice.

Data for the consensus time in the delayed-consensus phase are plotted in Fig. 9,
divided by the prediction (57) so that as $N \to \infty$, one would expect data to converge
to the line $y = 1$ (except within a region of size $1/N$ near $x = 1$). Combinations
Figure 8. Difference between the mean consensus time $T(x)$ and the limiting $N \to \infty$
prediction (a) $T(x) \sim \ln\beta$ when $x > \frac12$ and (b) $T(x) \sim 3 \ln\beta$ when $x < \frac12$. The solid
lines show the functions to which the data should converge as $N \to \infty$, given by (48).
Points show simulation data for various combinations of $b$ and $c$ at $N = 2000$. Error
bars are as described in the caption to Fig. 7.



of model parameters other than those used in the figure ($b = -0.65$, $c = 0.6$) show
a similar approach to this line, so taken together we would suggest that the predicted
convergence is reflected in the numerical data. Certainly we find a more convincing
convergence when the parameter $\alpha = 2x^*_{AB}$ (the value used for the figure) than when
$\alpha = 1$ (data not shown). This is consistent with our earlier observation that this would
be the appropriate choice for $\alpha$ in the delayed-consensus regime for quantities dominated
by the dynamics in the central region (as opposed to the boundaries). As we saw in
Section 6.2, the dominant contribution to consensus time is the time needed to escape the
potential well that is centred at $x = \frac12$ and whose depth is large compared to the scale
of the fluctuations.

To summarise, analytical results for the consensus probability obtained in the large-
$N$ limit appear to give a good fit to simulation data for finite systems in both the
positive and negative $\mu$ regimes, albeit with a modified choice of $\alpha = 1$ to describe the
boundary behaviour in the $\mu < 0$ regime. The predicted mean consensus time behaviour
is reproduced for $\mu > 0$ and $x \gg \frac12$. When $\mu > 0$ and $x \ll \frac12$, consensus on the $A$
variant is sufficiently rare that it is difficult to discern whether the predicted behaviour
is observed; and when $\mu < 0$, the finite-size effects seem still to be strong over the
range of $N$ within which consensus occurs quickly enough to be seen in simulation. One
possible way to strengthen these conclusions would be to use specialised techniques for
sampling rare events (such as forward flux sampling [36]), which we leave as a possibility
for future work.
Figure 9. Consensus time $T(x)$ divided by the prediction $T^*$ given by the right-hand
side of (57) for the $\mu < 0$ regime. The specific parameter choices used were $b = -0.65$,
c = 0.6 (and hence ji = —0.025) with 10^/A^ repetitions of the dynamics from each 
initial condition. 



7.2. Correspondence between the simplified and full dynamics 

The results for the consensus probability and mean consensus time are given in terms
of the initial overall frequency $x$ of $A$ tokens. In the full dynamics, a single value of
$x$ may be realised through various relative numbers of $AA$ and $AB$ agents. To test
whether the predictions based on the simplified dynamics extend to the full dynamics,
we must, for any given value of $x$, choose several compatible initial conditions. We
ran simulations with initial $x_{AB} = \{0, \frac12 x_{\max}, x_{\max}\}$ where $x_{\max}$ is the maximum possible
fraction of $AB$ agents that can be realised for given $x$. For $x > \frac12$, $x_{\max} = 2(1 - x)$. Once
$x$ and $x_{AB}$ are specified, $x_{AA}$ is given by $x - \frac12 x_{AB}$, and $x_{BB}$ through the normalisation
$x_{AA} + x_{AB} + x_{BB} = 1$.
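The construction of compatible initial conditions can be sketched in a few lines of Python. The text gives $x_{\max} = 2(1-x)$ only for $x > \frac12$; the sketch assumes the mirrored bound $x_{\max} = 2x$ for $x \le \frac12$, which follows from requiring $x_{AA} \ge 0$. The function name is an illustrative assumption.

```python
def initial_conditions(x):
    """Return three (x_AA, x_AB, x_BB) triples sharing the same A-token frequency x."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    # x_AA >= 0 requires x_AB <= 2x; x_BB >= 0 requires x_AB <= 2(1 - x).
    x_max = 2.0 * min(x, 1.0 - x)
    triples = []
    for x_ab in (0.0, 0.5 * x_max, x_max):
        x_aa = x - 0.5 * x_ab          # token frequency: x = x_AA + x_AB / 2
        x_bb = 1.0 - x_aa - x_ab       # normalisation of agent fractions
        triples.append((x_aa, x_ab, x_bb))
    return triples

if __name__ == "__main__":
    for triple in initial_conditions(0.75):
        print(triple)
```

Each triple has non-negative entries that sum to one, and all three give the same overall token frequency $x_{AA} + \frac12 x_{AB} = x$.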

The first test of the predictions is whether the error-function form of the consensus
probability $Q(x)$ applies when $\mu > 0$. Fig. 10 shows simulation data obtained for
various $b$ and $c$ and, for each $x$, the three initial conditions just described. The data
fit the error function reasonably well, but closer inspection shows that whilst the fit
is best for the set of data that have the initial $x_{AB} = x_{\max}/2$, those for $x_{AB} = 0$ and
$x_{AB} = x_{\max}$ consistently lie respectively below and above the predicted curve in the range
$x > \frac12$. Presumably these discrepancies are due to the fact that these initial conditions
most strongly violate the assumption that $x_{AB} = 2\alpha x_A(1 - x_A)$, invoked to obtain
Figure 10. Consensus probability $Q(x)$ within the full dynamics. In both figures,
different symbols relate to different initial fractions of inconsistent ($AB$) agents for
a given value of $x$. Errors are approximately the same size as the symbols in
the main figures. (a) In the rapid-consensus regime $\mu > 0$, $x$ is rescaled via
$\xi = \sqrt{\beta / (1+\sigma)}\, (x - \frac12)$ as in Fig. 6. Points show results from Monte Carlo simulations
within communities of $N = 16000$ agents, each initial condition repeated 3125 times.
The solid curve is the prediction from the simplified dynamics. The values of $\alpha$ and $\sigma$
used to transform from $x$ to $\xi$ are as given by (10) and (15). (b) The corresponding
data for the delayed-consensus phase, $\mu < 0$. As in Fig. 6, solid lines show a fit to
the predicted boundary behaviour (53) with the parameter setting $\alpha = 2x^*_{AB}$; dashed
lines the prediction with $\alpha = 1$.



the simplified dynamics. A better prediction for the function $Q(x)$ generated by the full
dynamics would therefore somehow need to take the initial number of inconsistent ($AB$)
agents into account. We remark that transforming these data with $\alpha = 1$, rather than
$\alpha = 2x^*_{AB}$, yields a worse overall fit to the error function. This is in accordance with our
earlier observation that quantities dominated by $x \sim \frac12$ (which, in the analysis, is where
the error-function form comes from) are better predicted by the choice $\alpha = 2x^*_{AB}$. In
the delayed-consensus phase, $\mu < 0$, we find (as with the simplified dynamics) that a better
fit of (53) to the data near the boundaries is achieved by taking $\alpha = 1$, as is shown by
Fig. 10(b).

We now examine how well predictions for the mean consensus time from the
simplified dynamics carry over to the full dynamics. Whilst data for the simplified
dynamics were not inconsistent with asymptotic convergence of the function $T(x)/\ln\beta$
onto the predicted step function, those for the full dynamics are far less convincing, as
shown by Fig. 11(a). Taking the point at which the curves intersect as a guide to the
height of the step in the region $x > \frac12$, we see that the prediction (48) overestimates
this by about 30%. It is natural to hypothesise that the discrepancy is due to the
initial condition being incompatible with the restriction $x_{AB} = 2\alpha x_A(1 - x_A)$ used to
formulate the simplified dynamics. However, we find that the simulated consensus time
data (not shown) are largely insensitive to the initial value of $x_{AB}$. Instead, we find that
different combinations of $b$ and $c$ that have the same value of $\mu = \frac12 (b + c)$ give roughly
Figure 11. (a) As Figure 7 except that the points were obtained from simulations of
the full hybrid model dynamics, not the simplified dynamics. The condition $b = c = \frac12$
is the same; the initial fraction of inconsistent AB agents is xab = f — and each 
initial condition was repeated 5 x fO^/iV times, (b) Data from the same simulations, 
but at fixed $N = 16000$ and various combinations of $b$ and $c$.



similar consensus times, and that the main variation is between different values of $\mu$:
see Fig. 11(b).

What is remarkable is that the shape of the consensus time function $T(x)$ obtained
by simulation is consistent with the analytical expression $\gamma + \ln[x(1-x)/(2x-1)^2]$
predicted by the simplified dynamics. To see this, we plot $T(x)$ for different $\mu$, and
shift each data set by a $\mu$-dependent constant $\kappa_\mu$: see Fig. 12. We have also
observed that this constant is independent of the community size $N$ if one makes the
mean-field approximation $\alpha = 1$ (rather than $\alpha = 2x^*_{AB}$). Again this could be a
consequence of the leading $\ln\beta$ contribution to the consensus time originating from
behaviour near the boundary, specifically, the linear vanishing of both the deterministic
and the diffusion terms in the Fokker-Planck equation (16).
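The origin of this shape function can be checked directly. The following minimal sketch assumes, for illustration only, that the deterministic part of the simplified dynamics is $\dot{x} = x(1-x)(2x-1)$ with unit prefactor (the true prefactor depends on the model parameters). It compares the time taken to drift from $x$ to a point near the boundary, evaluated by numerical quadrature, with the difference of the closed-form expression $\ln[x(1-x)/(2x-1)^2]$ at the two endpoints. The function names are hypothetical.

```python
import math

def shape(x):
    """Shape function ln[x(1-x)/(2x-1)^2], defined for 1/2 < x < 1."""
    return math.log(x * (1.0 - x) / (2.0 * x - 1.0) ** 2)

def travel_time(x_start, x_end, steps=20000):
    """Time to drift from x_start to x_end under dx/dt = x(1-x)(2x-1).

    Midpoint rule applied to dt = dx / [x(1-x)(2x-1)]; the unit prefactor
    in the drift is an assumption made for illustration.
    """
    h = (x_end - x_start) / steps
    total = 0.0
    for i in range(steps):
        u = x_start + (i + 0.5) * h
        total += h / (u * (1.0 - u) * (2.0 * u - 1.0))
    return total

if __name__ == "__main__":
    x_end = 0.999  # stop just short of the absorbing boundary
    for x in (0.55, 0.7, 0.9):
        numeric = travel_time(x, x_end)
        closed = shape(x) - shape(x_end)
        print(f"x={x:.2f}  numeric={numeric:.6f}  closed form={closed:.6f}")
```

The $x$-dependence of the travel time is carried entirely by `shape(x)`; the endpoint only contributes an $x$-independent constant, which is why a constant shift is enough to collapse the data.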

Similar trends are displayed by data obtained for the delayed-consensus phase:
the true consensus time seems to fall below that predicted by the simplified dynamics
and is insensitive to the initial condition, as can be seen from Fig. 13 for the
specific parameter combination $b = -0.65$ and $c = 0.6$. Other parameter choices do not
necessarily show the undershoot evident in Fig. 13: for example, with $b = -0.05$, $c =$
(not shown), convergence to the consensus time given by (57) is more convincing. Taking
all the data for different combinations of $b$ and $c$ together, it is not clear whether taking
(57) with $\alpha = 1$, rather than $\alpha = 2x^*_{AB}$, gives overall a better fit. We therefore
conclude that although the data for the full dynamics are consistent with an exponential
growth in the consensus time with the community size $N$, predictions from the simplified
theory cannot be confirmed or rejected without access to data for larger system sizes.

To finish, we briefly examine the crossover regime, $|\mu| \sim 1/N$. In Section 6 we
found an expansion around the pure Voter Model behaviour as a series in a parameter $\nu$
defined by (62) for the case where both $b$ and $c$ vanish with $N$ as $1/N$ (Case I). To test

Figure 12. Same consensus time data within the full dynamics as in Fig. 11(b),
but instead of being scaled by a factor $\ln\beta$, shifted by an amount $\kappa_\mu$ to obtain
a roughly common curve (as judged by the eye). The dashed line is the function
$\ln[x(1-x)/(2x-1)^2]$ predicted by the deterministic part of the simplified dynamics.

the validity of this expansion within the full dynamics, we plot the differences between the
measured consensus probability and time and their Voter Model values (i.e., the leading-order
terms in (66) and (67)), divided by $\nu$, and compare with the coefficient of the first-order
terms. This we did for $N = 1600$ and a range of $\nu$, both positive and negative.
For the positive $\nu$ values, we took $b = c = \nu/N$; for the negative values, $b = -3\nu/N$ and
$c = \nu/N$. The desirable range of $\nu$ is such that it is sufficiently small for second-order
effects to be negligible, but sufficiently large that deviations of $Q(x)$ from the $\nu = 0$
limit $Q(x) = x$ are not swamped by the noise. To this end, we plot in Fig. 14(a) the
first-order term in (66), the expected mean value of the sampled quantity, and around it
the expected standard deviation of the data around this mean corresponding to the two
different values of $|\nu|$ that are displayed. The data are consistent with these intervals.
The consensus time data, Fig. 14(b), when errors are taken into account, are also broadly
consistent with the prediction from the simplified dynamics but, in common with the
other data discussed in this section, to a lesser degree than the consensus probability
data.
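The width of the intervals around the first-order term can be estimated as follows. This is a rough sketch that assumes each simulation run is an independent Bernoulli trial whose success probability is close to the Voter Model value $Q(x) = x$, and that the plotted quantity is the sampled deviation divided by $\nu$. The function name is hypothetical; the repetition count and the two values of $|\nu|$ are those quoted for Fig. 14.

```python
import math

def scaled_std(x, nu, repeats):
    """Std. dev. of (Q_hat - x)/nu when Q_hat averages `repeats` Bernoulli(x) runs."""
    return math.sqrt(x * (1.0 - x) / repeats) / abs(nu)

if __name__ == "__main__":
    repeats = 6250  # repetitions per initial condition at N = 1600
    for nu in (1.0, 0.125):
        row = ", ".join(f"{scaled_std(x, nu, repeats):.4f}" for x in (0.1, 0.3, 0.5))
        print(f"|nu| = {nu}: {row}")
```

Because the deviation is divided by $\nu$, the interval for the smaller $|\nu|$ is wider by the ratio of the two values, which is why the two sets of data share a mean but not an error band.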

In summary, then, we find that the simplified dynamics acts as a good proxy for the
full dynamics as far as the consensus probability is concerned, and that the additional
freedom in the distribution of A tokens between consistent AA agents and inconsistent

Figure 13. Measurement of the consensus time $T(x)$ within the full dynamics for
$b = -0.65$, $c = 0.6$ and hence $\mu = -0.025$. As in Fig. 9, $T(x)$ has been divided by the
prediction $T^*$ given by the right-hand side of (57). The different symbols correspond
to the three different initial conditions, and different colours to different system sizes.

Figure 14. (a) Deviation of the function $Q(x)$ from the $\nu = 0$ limit within the full
dynamics in a community of $N = 1600$ speakers and 6250 repetitions from each initial
condition. Dashed line shows the expectation value for each point; the inner and outer
intervals show one standard deviation around this for $|\nu| = 1$ and $|\nu| = 0.125$
respectively. (b) The analogous data for the consensus time function $T(x)$, this time with
standard error on each sampled mean displayed explicitly.

AB agents has a fairly small effect on the consensus probability function. The consensus
time, meanwhile, seems to be insensitive to different ways of setting up an initial
condition with overall A token frequency $x$, but depends on the parameters $b$ and $c$ in a
way that is not predicted by the simplified dynamics. This is particularly evident in the
regime $\mu > 0$ and $x > \frac{1}{2}$, where, up to a $\mu$-dependent shift, the shape of the
consensus time function is well described by the function arising from the simplified
dynamics. As with the simplified dynamics, we find that quantities governed by behaviour
near the boundaries are better fit by taking $\alpha = 1$, and that finite-system and -sample
size effects hamper our ability to make strong conclusions in the regimes $\mu > 0$,
$x < \frac{1}{2}$ and $\mu < 0$. Meanwhile, the behaviour in the crossover regime of the full
dynamics appears to be well captured by the perturbative expansion performed within the
simplified dynamics.

8. Discussion and outlook 

In this work, our aim has been to obtain a generic understanding of consensus formation 
in multi-agent systems by bringing together hitherto disparate statistical-mechanical 
models of language dynamics. By restricting to the empirically-relevant case of two 
variant forms that are competing to become the single convention shared by all members 
of a community, we have been able to unify the dynamics of the much-studied Voter 
Model and its relatives (one of which is the Utterance Selection Model [13]) with that 
of the Naming Game [Hj- Our analysis shows that the specific implementation of 
maximising behaviour employed in the Naming Game leads to an effective repulsion 
in the frequencies of the A and B variants in the community mediated by what we 
have called inconsistent agents, i.e., those who use both variants equally often. Certain 
parameter choices correspond to an 'anti-maximisation' behaviour, in which agents are
reluctant to abandon the notion that either variant may equally well represent the target 
meaning. A similar, but distinct, update rule had previously been implemented in a 
model due to Baronchelli et al [22], and in both the hybrid model family and that 
described in [22] the same generic phase structure is seen. In one phase, consensus 
on the majority variant is reached, and in the other, both variants coexist with equal 
frequency. This phase diagram was found to be robust to the addition of noise, which 
permits consensus even among anti-maximising agents, albeit after a time that grows 
exponentially with the community size. 

The analysis of the stochastic dynamics was achieved by replacing the fraction of 
inconsistent agents, a stochastic variable, with a deterministic function of the frequency 
of A variants in the community. Despite the somewhat crude and uncontrolled nature 
of this approximation, we found from Monte Carlo simulations that the consensus 
probability from a given initial condition was nevertheless well described, and although 
the growth of the consensus time with community size (logarithmic and exponential 
in the rapid and delayed consensus phases, respectively) predicted by the simplified 
dynamics was seen in simulations, the simplified model does not provide a precise 
quantitative description of the consensus time behaviour seen in the full dynamics.

Rather than dwell on the inadequacies of the simplified dynamics as a proxy for
the full dynamics, discussed at length in Section 7, we will now instead consider what
light the present study sheds on linguistic consensus-formation processes in general. It
is interesting to note that the forms of the drift and diffusion terms in the Fokker-Planck
equation (16) are precisely what one would write down in an abstract formulation
following the spirit of Landau free energy theory. That is, they are both the lowest-order
polynomial expressions that respect the symmetries and boundary conditions of
the problem. For consensus to be an absorbing state, it is necessary for both to vanish
when a single variant remains. Furthermore, if we restrict ourselves to neutral theories,
i.e., those in which A and B variants are distinguished only by their frequencies, we
require that the drift term is antisymmetric around $x = \frac{1}{2}$, and that the diffusion
term is symmetric. The expressions $x(1-x)(1-2x)$ and $x(1-x) + \sigma x^2(1-x)^2$ are,
indeed, the lowest-order terms in a Landau-like expansion with these properties. It
is reassuring that these expressions were obtained for concrete models that have been
specifically introduced to study language dynamics. Furthermore, precisely these forms
are also found for the distinct class of models discussed in [22] (with appropriate choices
of the parameters $\mu$ and $\sigma$). Thus we should expect to find for those dynamics too
that at a finite system size $N$, a crossover to Voter-Model-like behaviour emerges at a
distance of order $1/N$ from the transition point. Given that these different microscopic
dynamics lead to the same effective description, and that the two distinct maximising
rules in the Naming Game enter into the expressions in a similar way, we anticipate that
other concrete models of maximising behaviour are likely to lead to similar macroscopic
dynamics, at least in mean-field communities.
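These requirements are easy to verify numerically. The short sketch below checks that the expressions $x(1-x)(1-2x)$ and $x(1-x) + \sigma x^2(1-x)^2$ vanish at $x = 0$ and $x = 1$ and have the stated parity about $x = \frac{1}{2}$. The function names and the test value of $\sigma$ are arbitrary choices made for illustration.

```python
def drift(x):
    """Lowest-order antisymmetric drift shape, x(1-x)(1-2x)."""
    return x * (1.0 - x) * (1.0 - 2.0 * x)

def diffusion(x, sigma):
    """Lowest-order symmetric diffusion shape, x(1-x) + sigma x^2 (1-x)^2."""
    return x * (1.0 - x) + sigma * x**2 * (1.0 - x) ** 2

def check_properties(sigma=0.7, points=11, tol=1e-12):
    """Return True if both shapes vanish at x = 0, 1 and have the stated parity."""
    boundary_values = (drift(0.0), drift(1.0),
                       diffusion(0.0, sigma), diffusion(1.0, sigma))
    ok = all(abs(v) < tol for v in boundary_values)
    for i in range(points):
        x = i / (points - 1)
        # Drift is odd, diffusion is even, under the reflection x -> 1 - x.
        ok = ok and abs(drift(1.0 - x) + drift(x)) < tol
        ok = ok and abs(diffusion(1.0 - x, sigma) - diffusion(x, sigma)) < tol
    return ok

if __name__ == "__main__":
    print("symmetry and boundary conditions satisfied:", check_properties())
```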

The key properties of these expressions that dominate the collective behaviour of 
the hybrid model are: (i) the presence of a maximum or minimum in the potential 
defined by Eq. (21); and (ii) the manner in which the drift and diffusion terms vanish
at the boundaries. A potential maximum leads to the error-function form of the 
consensus probability in the rapid consensus phase (via a harmonic approximation at 
the maximum), whilst a minimum ultimately gives rise to the exponential growth with 
inverse temperature in the time needed to escape it. The vanishing drift term leads to a 
divergence in the consensus time which is controlled by a cut-off governed by the form of 
the potential near the boundaries, as demonstrated by Eq. (40). A leading linear decay
of the drift term at the boundaries would appear always to imply a logarithmically- 
growing consensus time; we anticipate that other leading terms will cause polynomial 
growth of the consensus time with population size. Meanwhile, a potential containing 
multiple maxima and minima will likely lead to a model that displays a mixture of 
the behaviour we have seen here, that is, consensus probability functions that take 
error-function forms around potential maxima and exponentially growing contributions 
to consensus times arising from each minimum (along with logarithmic or polynomial 
contributions from boundary regions). To firm up these speculative conclusions, it 
would perhaps be worthwhile in the future to investigate more systematically the range 
of possible emergent behaviour arising from different forms of the drift and diffusion

Feature                      Initial frequency
Diphthong shift
Rounded LOT vowel
DANCE vowel                  0.52
Fronted and lowered STRUT    0.34

Table 2. Numerical estimates for initial frequencies of conventionalised variants in
the formation of the New Zealand English language dialect, taken from the indicated
pages of [27].

terms, and also to develop a more controlled scheme for reducing a multi-dimensional 
dynamics to a single coordinate, within which such concepts as a potential take on a 
straightforward interpretation. 
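As an illustration of how a potential maximum or minimum shapes the consensus probability, the following sketch solves the stationary backward equation $\frac{1}{2}B(x)Q''(x) + A(x)Q'(x) = 0$ with $Q(0) = 0$ and $Q(1) = 1$ for an assumed drift $A(x) = \mu x(1-x)(2x-1)$ and diffusion $B(x) = x(1-x)/N$. These forms, the neglect of the higher-order diffusion term, and the shorthand $k = N\mu$ are assumptions made here for illustration rather than the expressions of the hybrid model verbatim. Under them $Q'(u) \propto \exp[-2k(u - \frac{1}{2})^2]$, which gives an error function for $k > 0$, a straight line for $k = 0$, and a flattened central region for $k < 0$.

```python
import math

def weight(u, k):
    """Q'(u) up to normalisation: exp[-2k(u - 1/2)^2], with k = N*mu (assumed form)."""
    return math.exp(-2.0 * k * (u - 0.5) ** 2)

def integrate(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def consensus_probability(x, k):
    """Probability that variant A wins from initial frequency x."""
    f = lambda u: weight(u, k)
    return integrate(f, 0.0, x) / integrate(f, 0.0, 1.0)

if __name__ == "__main__":
    for k in (50.0, 0.0, -20.0):
        values = ", ".join(f"{consensus_probability(x, k):.4f}"
                           for x in (0.3, 0.45, 0.55, 0.7))
        print(f"k = {k:6.1f}: Q = {values}")
```

Numerical quadrature is used instead of a closed form so that the same routine covers both signs of $k$; replacing `weight` with another drift-to-diffusion ratio would allow potentials with several maxima and minima to be explored in the same way.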

A number of studies have been devoted to establishing connections between the 
Voter and Ising models [16, 17], and in particular their universal critical phenomena.
In the Ising model at low temperatures, spins preferentially align with their neighbours, 
which is similar in spirit to the local maximisation behaviour of agents in the Naming 
Game. Here it has been found that Voter-type coarsening lies at a boundary between 
phases characterised by the presence and absence of order. The phase diagram we have
found here has rather similar characteristics, if one interprets the metastable state of 
the delayed-consensus phase as a disordered phase. In contrast to other works, our focus 
has been on statistics of the transit to an absorbing state in a finite system, as opposed 
to universal features of the stationary state [15] or of coarsening and persistence [16, 17]
displayed by infinite systems. It is not clear (apart, perhaps, from the overall timescales 
involved) that these contrasting properties are straightforwardly related. 

An important question left open by our work concerns the effect of community 
structure on the consensus dynamics. The behaviour of the Voter Model in models with 
almost arbitrary structure is by now well understood, since in that case the essential 
contribution to the dynamics is captured by a single 'collective coordinate' [37, 23,
24, 25]. On the other hand, it is known that a system evolving by zero-temperature
Glauber dynamics (which corresponds to an extreme form of maximising behaviour)
on finite networks can become trapped in disordered metastable configurations [38]. It
would be interesting to examine the finite-temperature behaviour of the present model on more
general network structures. One possibility here might be to try and expand around 
the analytically tractable Voter Model as a means to understand the crossover regime 
between the two phases. 

One may also legitimately ask what kind of parameter settings best describe a real 
human system. Here, relevant data are thin on the ground, although one pertinent set 
is provided by a thorough study of the emergence of the New Zealand English language

Figure 15. Probability of observing the set of outcomes documented in the formation
of the New Zealand English dialect [11, 27] within the hybrid model given the
combination $\lambda = \beta/(1 + \sigma)$ appearing in the consensus probability function $Q(x)$.

dialect [11, 27]. This was formed through contact between different British and Irish
dialects that arose through immigration to New Zealand in the mid-19th century. In
[27], seven features that initially exhibited variability are identified with estimates of
the initial frequency of the surviving variant. These data are summarised in Table 2.
Assuming that the features evolved independently, we can ask how likely, given some
combination of $b$ and $c$ in the model, consensus would have been reached on the set
of variants with the initial frequencies $x_i$ specified in the table by calculating
$\prod_i Q(x_i)$. We can then maximise this probability with respect to the model parameters
to identify the instance of the hybrid model that best describes these empirical data. For
large $N$, $Q(x)$ depends in its central region on $b$ and $c$ through the parameter combination
$\lambda = \beta/(1 + \sigma)$ in both the rapid- and delayed-consensus phases. So, in fact, the best
we can do is find the maximum likelihood value of $\lambda$. The likelihood as a function of
$\lambda$ is plotted in Fig. 15, from which we see that the data are best described by a value
of $\lambda$ that is negative and of order unity. Assuming that $\beta$ is an increasing function of
population size in structured as well as unstructured populations, and given that the
number of New Zealand English speakers during the relevant historical period was over
10^, one may argue that if the community is well described by a single maximisation 
strategy, it is, in fact, a weak bias against maximisation — perhaps as small as 10~^, if 
$\beta \propto N$ as it is in an unstructured community. This observation may justify the use of
the Utterance Selection Model as a means to critique theories for new-dialect formation
in New Zealand [39], although this should be tempered by the fact that models allowing,
for example, for individual differences between speakers, and for their strategies to change
over time (as suggested by the experimental study of [20]), may fit the data better.
These extensions to the model we leave as further possibilities for future study.
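The likelihood calculation described above can be sketched as follows. The functional form used for $Q(x)$ is a stand-in chosen for illustration, $Q'(u) \propto \exp[-2\lambda(u - \frac{1}{2})^2]$, and should be replaced by the expression derived for the hybrid model. Only two initial frequencies (0.52 and 0.34) are used, rather than the full set of seven features, so the output is not a reproduction of Fig. 15. All function names are hypothetical.

```python
import math

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def q_central(x, lam):
    """Stand-in consensus probability with Q'(u) proportional to exp[-2*lam*(u-1/2)^2]."""
    w = lambda u: math.exp(-2.0 * lam * (u - 0.5) ** 2)
    return simpson(w, 0.0, x) / simpson(w, 0.0, 1.0)

def likelihood(freqs, lam):
    """Probability that every independent feature fixes on its surviving variant."""
    return math.prod(q_central(x, lam) for x in freqs)

def best_lambda(freqs, grid):
    """Grid point that maximises the likelihood."""
    return max(grid, key=lambda lam: likelihood(freqs, lam))

if __name__ == "__main__":
    freqs = [0.52, 0.34]  # illustrative subset of initial frequencies
    grid = [0.5 * i for i in range(-20, 21)]
    lam_hat = best_lambda(freqs, grid)
    print(f"lambda maximising the likelihood on the grid: {lam_hat}")
    print(f"likelihood there: {likelihood(freqs, lam_hat):.4f}")
```

With so few data points the maximum may sit at the edge of the grid, so a real analysis would need the full set of frequencies and a check that the optimum is interior.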
Appendix A. Symmetry relation for consensus time integral



In this Appendix we show that the integral $I(x)$, defined through (30), satisfies the
relation (45),



lim + = 



in the range of x that satisfies ^ x ^ 



1 1 



(A.l) 

as the limit is taken. We begin by noting that the integrand appearing in $I(1)$ is
symmetric about $u = \frac{1}{2}$, and so we may write
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is a constant independent of /3, h{u) 
l\n[l + Aau{l-u)]. 
We may then write

To arrive at these expressions we have used the antisymmetry property $y(1-x) = -y(x)$.

Rearranging,

2 1 — y[x) Jx b[u) 

In the integral for Jix), we have that both x and u are much less than \ \= and so 

we may use ( !37|) to approximate both and y{u). Bearing in mind also that a; ^ 
we find to leading order 

- 1 (A.9) 



2 g-/3V(x) Q-PV{u) 

J{x) ~ ^ / d-u- 



V^'(x) Jo h{u) [V'{u) 

This integral is dominated by its upper endpoint (the integrand vanishes linearly with
$u$ as $u \to 0$), and so we can drop the 1 that appears in the square brackets. Then the
leading behaviour of the integral is

2 1 /-x qI3[V{u)-V{x)] 

A^) T^TTTTT / dtx . (A.IO) 

Expanding around the endpoint u = x, we find that J{x) ~ 0(/3^^/^). 

For the $K(x)$ integral meanwhile, we find with a similar approximation for $y(x)$ that

1 1 rl-x Q-l3[Viu)~V{x)] 



2/tvp K (x) o(n) 
The dominant contributions to this integral come from the endpoints, and so by 
performing the same expansion as previously we find similarly that K{x) ~ 0(/3^'^/^). 
Hence inserting into (A.4) we find that the limit in (A.1) is zero, as required.